Dynatrace is hiring a

Senior Incident Commander - Site Reliability Engineering

Detroit, United States
Full-Time

We are strengthening our incident management team. You will be at the helm, managing incidents and leading the way. Your role at Dynatrace is crucial in ensuring best-in-class reliability and shaping incident response for our customers. Your detailed responsibilities in this new team will be:

Prepare for Effective Incident Response: 

  • Response Coverage: Join a new global team of Incident Commanders coordinating incidents 24/7 in a follow-the-sun model.
  • Training and Preparedness: Train teams on incident response protocols and ensure readiness for critical incidents.
  • Process Improvement: Ensure our incident management process is best-in-class and aligned with industry standards as well as company and customer needs.

Navigate Critical Incidents with Success:

  • Incident Coordination: Manage high-severity incidents, leading temporary response teams to ensure timely resolution and minimal business impact.
  • Analysis and Mitigation: Coordinate the team to understand impacts, perform forensics, categorize and mitigate incidents, ensuring the right experts are engaged.  
  • Communications: Ensure all personnel know their roles during incidents. Keep teams aligned and ensure regular updates to customers and internal stakeholders. 

Continuously Learn and Improve: 

  • Postmortem Management: Lead blameless postmortem sessions, reviewing incident response and resilience, and tracking execution of improvement actions.
  • Metrics and KPIs: Define and track key metrics to measure the effectiveness of incident management and leverage them for data-driven improvement planning.
  • Customer Interaction: Prepare detailed postmortem write-ups for customers, providing clear and actionable insights. Monitor and report on SLAs.
  • Stakeholder Communication: Maintain a holistic view of production status and communicate updates to internal stakeholders and customers.

Qualifications:

  • Proven experience in incident management and SRE or Security Operations, ideally within a SaaS environment.
  • Strong technical background with the ability to understand complex systems and troubleshoot issues.
  • Strong team player who stays calm and keeps the group focused in tough situations.
  • Excellent communication skills, both written and verbal, with the ability to convey technical information to non-technical stakeholders.
  • Experience with postmortem processes and continuous improvement methodologies.
  • Ability to work in a fast-paced, dynamic environment and manage multiple priorities.
  • Passionate about pushing the limits to keep a vast SaaS solution reliable and performant at scale.

Minimum Qualifications

  • Must be a US citizen.

All your information will be kept confidential according to EEO guidelines.

We offer competitive compensation and company-sponsored premium benefits, including medical, dental, vacation/holidays, and a company-matching 401(k) plan. Dynatrace is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, sex, color, gender identity, religion, national origin, ancestry, citizenship, physical abilities, age, sexual orientation, creed, disability status, veteran status, pregnancy, genetic status, or any other characteristic protected by law. If your disability makes it difficult for you to use this site, please contact [email protected]. Dynatrace participates in E-Verify, participant information in English and Spanish. Right to work information in English and Spanish. EEO is the Law/EEO is the Law Supplement. To be considered for this position, please upload your resume/CV.