
Principal Hardware Engineer

Microsoft
United States, California, San Diego
Oct 28, 2025
Overview

Microsoft Silicon Engineering Solutions and Cloud Hardware Infrastructure Engineering (SCHIE) is the team behind Microsoft's expanding Cloud Infrastructure and is responsible for powering Microsoft's "Intelligent Cloud" mission. SCHIE delivers the core infrastructure and foundational technologies for more than 200 Microsoft online businesses globally, including Bing, MSN, Office 365, Xbox Live, Skype, OneDrive, and the Microsoft Azure platform, through our server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions. Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide, and we are looking for passionate, high-energy engineers to help achieve that mission.

Azure Memory and Storage Center of Excellence (AMS COE) is part of the SCHIE organization and focuses on the memory and storage devices that go into cloud hardware servers. AMS provides memory and storage solutions to Azure and drives memory and storage suppliers to deliver high-quality products that meet our requirements. We are looking for seasoned engineers with a passion for customer-focused solutions, insight, and industry knowledge to architect and specify memory and storage hardware solutions that optimize for quality, reliability, cost, and performance.

As a Principal Hardware Engineer, you will apply system and memory/storage expertise to provide best-in-class support for Azure cloud servers. You will combine skills in data analysis, hardware debugging, and subject-matter expertise to ensure hardware in Azure is in service and fully operational. To achieve this, you will collaborate closely with other hardware and software engineers at Microsoft worldwide. In these collaborations you will prepare and present data analyses, provide recommendations, develop solutions to address problems, and define solution requirements.
Responsibilities

Develop comprehensive validation plans and strategies for PCIe Switch, Retimer, and related storage components (NVMe/SSDs).
Define, prototype, and document test requirements and specifications for productization and supplier implementation.
Implement and execute tests to reproduce and analyze failures from qualification or production environments.
Collaborate with system and fleet support teams to perform detailed root-cause analysis of PCIe and storage issues.
Design and build advanced automation frameworks using Python, C, and C++ for test orchestration, data processing, and reporting.
Create scalable and modular test frameworks and tools for automated test execution, result parsing, triage, and dashboard generation.
Provide technical leadership in validation architecture, automation design, and test methodology.
Partner with internal teams and test development houses to establish standardized validation suites and consistent methodologies for PCIe components.
Perform deep failure analysis using PCIe/SAS protocol analyzers, oscilloscopes, and other diagnostic instruments.
Lead board-level bring-up and debug using JTAG, serial interfaces, and kernel-level tools across Windows and Linux platforms.
Support validation and qualification for data center, GPU, and AI/ML platforms integrating PCIe Switch and Retimer components.
Ensure readiness and interoperability across multiple PCIe generations (Gen4/Gen5/Gen6) and storage technologies (NVMe/SAS/SATA).
Serve as the subject matter expert for PCIe Switch and Retimer validation, driving end-to-end ownership and improving fleet reliability and scalability.