
Research Intern - Human Intelligence

Microsoft
United States, Washington, Redmond
Oct 26, 2025
Overview

Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

Join the Human Intelligence team in the Windows Applied Sciences Group to advance multimodal face normalization and RGB+NIR recognition for secure sign-in experiences (e.g., Windows Hello). We explore techniques that transform diverse, cross-modal inputs into robust identity representations; improve invariance to pose, illumination, and occlusion; and pursue compute-efficient perception for real-world deployment. We emphasize scientific rigor, reproducibility, and user-centric evaluation at production scale.
Responsibilities

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world's best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers but also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research and are offered year-round, though they typically begin in the summer.

Additional Responsibilities

Research & prototype methods in areas such as:
  Unified face normalization across modalities (e.g., RGB and NIR), with joint prototype + feature learning and cross-modal alignment.
  Multimodal face recognition (fusion across RGB, NIR, depth/IR, and audio cues where appropriate), with robustness and fairness under distribution shift.
  Large language model-aided face verification: explore vision-language model (VLM) and large language model (LLM) pipelines that (i) use visual context in the photo (attributes, scene cues, spatiotemporal hints) to assist verification; (ii) provide interpretable rationales; and (iii) improve failure detection and human-in-the-loop triage.
  Efficiency & reliability: distillation, quantization, and pruning; lightweight encoders and normalizers; calibration and uncertainty; liveness/anti-spoofing integration.
Evaluate thoroughly: define datasets and protocols; run ablations and benchmarks (ROC, EER, TPR@FAR, latency/memory, fairness/robustness).
Production immersion: learn Windows Hello-style pipelines (signals, constraints, on-device considerations) to align research with deployment.
Publish: communicate results via talks, internal tech reports, and submissions to top venues.