2026 Summer Intern, PhD, Software Engineer, DUE

Waymo
United States, California, Mountain View
1600 Amphitheatre Parkway
Dec 28, 2025
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver - The World's Most Experienced Driver - to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo's fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions in simulation across 15+ U.S. states.

The Driver Understanding and Evaluation (DUE) team at Waymo is focused on deeply understanding and assessing the performance of the Waymo Driver. The team develops and uses advanced evaluation methodologies, including machine learning-based metrics and reward functions, to analyze driving behavior and capabilities. Collaborating broadly with the Simulation, System Engineering, Research, and Onboard Software teams, DUE plays a critical role in the validation and improvement of the Waymo Driver, including the evaluation of foundational AI models. Its efforts support the scalable and rigorous assessment necessary to ensure the safety and efficacy of Waymo's autonomous technology.

Waymo interns partner with leaders in the industry on projects that create impact for the company. We believe learning is a two-way street: applying your knowledge while providing you with opportunities to expand your skill set. Interns are an important part of our culture and our recruiting pipeline. Join us at Waymo for a fun and rewarding internship!
You will:
- Implement reinforcement learning algorithms and novel reward functions within a high-fidelity autonomous driving simulator
- Design and run experiments to train the driving agent, systematically collecting and analyzing performance data to identify behavioral patterns and potential instances of reward hacking
- Iteratively refine the driving policy and the reward model based on experiment results, implementing and testing strategies to improve safety and alignment with intended driving behavior
- Collaborate with the research team to document experimental setups, present key findings, and contribute to the project's technical direction

You have:
- Strong programming proficiency in Python and hands-on experience with deep learning frameworks (e.g., TensorFlow)
- Solid theoretical understanding of machine learning and reinforcement learning fundamentals, including Markov Decision Processes, value functions, and policy gradients
- Practical experience applying modern deep RL algorithms (e.g., PPO, SAC, TD3) to continuous control problems
- Familiarity with software development best practices, including version control with Git

We prefer:
- Knowledge of advanced reward learning paradigms such as Reinforcement Learning from Human Feedback (RLHF) or Inverse Reinforcement Learning (IRL)
- Experience with large-scale or distributed training of machine learning models

Note: This will be a hybrid onsite internship position. We will accept resumes on a rolling basis until the role is filled. To be considered for multiple roles, you will need to apply to each one individually - please apply to the top 3 roles you are interested in.

The expected hourly rate for this full-time position is listed below. Interns are also eligible to participate in the Company's generous benefits programs, subject to eligibility requirements.

Hourly PhD Pay: $85 USD