Remote

Sr. Site Reliability Engineer, Incident Excellence

HashiCorp
United States
Nov 12, 2024
Our Organization

HashiCorp helps solve development, operations, and security challenges in infrastructure so organizations can focus on business-critical tasks. We build products to give organizations a consistent way to manage their move to cloud-based IT infrastructures for running their applications. We use the Tao of HashiCorp as our guiding principles for product development and operate according to a strong set of company principles for how we interact with each other. We value top-notch collaboration and communication skills, both among internal teams and in how we interact with our users.

Our Team

The HashiCorp Incident Excellence team is responsible for improving HashiCorp's incident response while maximizing learning from incidents. Our focus is on helping all engineers feel confident when they are on-call and improving communication to efficiently resolve incidents and build trust in our brand. We partner closely with teams to drive a holistic incident management strategy and share learnings to help our business continuously improve.

About this Role

This engineering role is on a nascent engineering team. The team is responsible for products that touch many areas of engineering organizations at HashiCorp, so applicants will need to excel at collaboration, have product-focused mindsets, and be comfortable iterating in an agile manner towards solutions.

In this role, you can expect to:

- Utilize your professional software engineering experience to periodically solve problems, build automation, and create components of our incident lifecycle management processes.
- Coordinate disaster recovery processes and identify strategic process improvements.
- Be responsible for and drive incident management capabilities and culture.
- Participate in incident command on-call rotation.
- Support incident management tooling.
- Build technical skills and relationships within a team of engineers and SREs.
- Learn, teach, and collaborate cross-functionally.

You may be a good fit for our team if you:

- Have professional experience designing or operating disaster recovery processes in a distributed cloud environment.
- Have professional experience with incident management in cloud environments.
- Enjoy working on a variety of scopes spanning software engineering, cloud infrastructure, and SRE.
- Have experience contributing to efficiency improvements of software at scale.
- Have experience collaborating cross-functionally to deliver engineering culture change.
- Have worked on infrastructure teams in customer-centric and agile organizations with empathy and compassion.
- Have worked with SaaS or another type of managed software offering.
- Have experience in one or more of the major public clouds.

Individual pay within the range will be determined based on job-related factors such as skills, experience, and education or training.

The base pay range for this role is:

- SF Bay Area / NYC area: $176,500 - $207,600 USD
- Seattle Metro, Denver / Boulder Metro, New York (excluding NYC), Washington D.C., or California (excluding SF Bay Area): $161,800 - $190,300 USD
- Colorado (excluding Denver / Boulder Metro) and Washington (excluding Seattle Metro): $147,100 - $173,000 USD