Principal Data Engineer, Supply Chain
Microsoft
Remote work
United States, Washington, Redmond
Jan 22, 2025
Overview
In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.
Microsoft Cloud Operations and Innovation (CO+I) is the engine that powers our cloud services. Our infrastructure comprises a large global portfolio of more than one hundred datacenters and one million servers. Our foundation is built upon and managed by a team of subject matter experts, who work tirelessly to support digital services for more than one billion customers and twenty million businesses in over ninety countries worldwide.
Within CO+I, the Core Datacenter Services (CDS) team is responsible for improving overall availability and efficiency for Microsoft's cloud business. We have multiple teams focused on both Microsoft and Lessor datacenter performance and on all aspects of datacenter utilization effectiveness, including power, water, and labor.
We are seeking a Principal Data Engineer, Supply Chain to support the Operational Supply Chain across datacenters globally. This role will design and execute a data model for the critical environment datacenter supply chain.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
This role can be based in any of the following hub locations: Atlanta, GA; Washington, D.C.; Redmond, WA; San Antonio, TX; or Phoenix, AZ. Relocation support will be provided, and successful candidates will need to relocate to or reside within 50 miles of the hub office location. This role is eligible for hybrid or remote work, up to 100%.
Responsibilities
Data Governance: Ensure data modeling and handling comply with laws and policies. Develop processes to document data type, classifications, and lineage for traceability. Govern data accessibility across pipelines and oversee data glossaries.
Data Tools Optimization: Identify opportunities to optimize data tools for transforming, managing, and accessing data. Write code to test data platforms and implement sustainable design patterns. Identify trends to inform future data architecture designs.
Data Extraction and Validation: Extract raw data from multiple sources using query languages, tools, or machine learning algorithms. Ensure data accuracy, validity, and reliability. Contribute to code reviews and drive the business case for advanced orchestration techniques. Plan and strategize data protocols and aggregation approaches to validate data quality.
Data Transformation: Develop and use advanced techniques to transform raw data into compatible formats for downstream sources. Expand the application and reusability of software and tools. Drive efficiencies in data extraction to ensure quality and completeness.
Stakeholder Collaboration: Collaborate with stakeholders to recommend data requirements. Partner with business teams to determine data costs, access, usage, and availability. Negotiate agreements with partners and system owners for project delivery and data ownership.
Data Modeling: Design data models that meet business requirements and translate business needs into design specifications. Lead conversations with stakeholders to improve data models and schemas. Develop solutions considering analytical requirements and compute/storage consumption.
Root Cause Analysis: Conduct root cause analysis for detected problems and implement solutions to prevent recurrence. Monitor self-healing processes to maintain data quality and performance. Use cost analysis to drive solutions and reduce budgetary risks.
Performance Monitoring: Ensure effective performance monitoring across data pipelines. Build automation into data visualizations and aggregations to monitor data quality and pipeline health. Develop troubleshooting guides and operating procedures for addressing complex problems flagged by automated testing.