Software Developer/ Engineer/ Architect

Incident Management Engineer

The Incident Management Engineer leads critical incident response efforts across many technical domains. This role involves partnering with the entire organization to drive incident resolution and fostering learning from incidents to inform future analysis, resilience, and reliability engineering efforts. As a member of the Centralized Incident Response team, the Incident Management Engineer is customer-focused, technically competent, and business-oriented.

Salesforce, the leading provider of cloud computing, uses a global network of internal and outsourced technical support engineers to deliver world-class, multi-language technical support to over 1 million subscribers. With our continued rapid growth, we are searching for strong engineers to join our Centralized Incident Response team. The Centralized Incident Response team is a dedicated group of elite technical incident commanders responsible for managing critical incidents for Salesforce products, with the goal of mitigating customer-impacting incidents with maximum efficiency. This team will evaluate every incident, perform analysis, and establish metrics with the goal of continually learning and improving.

Key Responsibilities:

  • Provide expert execution of the incident command process, including running and managing high-severity incident bridges and driving transparent communication that promotes maximum levels of internal and external customer satisfaction
  • Work directly with stakeholders and executives to drive resolution during incidents and improve overall response for future incidents
  • Lead cross-functional post-incident process reviews and incident analysis and drive continuous improvement of operations and execution
  • Lead enterprise-wide drills to prepare for and ensure efficient incident response and drive best practices
  • Closely partner and collaborate with Infrastructure, Engineering, Operations, Technical Support, Customer Success, and Sales Leadership to ensure alignment across the business

Required Skills/Experience:

  • BS in Computer Science or a related technical area
  • 10+ years of experience in a global application delivery/SaaS environment, handling highly complex issues
  • 5+ years managing, coordinating, and ensuring resolution on major incidents
  • Deep experience responding to and leading complex incident response in a 24/7/365 environment
  • Expertise in handling enterprise-level escalations, including managing, prioritizing, and delegating multiple escalations at once
  • Ability to execute with a high level of operational urgency, maintain calm, and work closely with a team and stakeholders during a critical situation
  • Strong operational and services experience in a cloud services delivery environment
  • Outstanding written and verbal communication skills at the C-level
  • Ability to articulate technical issues in a meaningful way to both engineers and executive-level management

Desired Skills/Experience:

  • Customer-centric attitude and focus on providing best-in-class service for customers and stakeholders
  • Familiarity with incident management frameworks (ICS/NIMS, ITIL, etc.)
  • Experience conducting incident investigations, including analysis of an incident as well as performance evaluations of responders
  • Flexibility, integrity, and creative problem-solving skills
  • Excellent project management skills, including demonstrated ability to manage projects across teams where influencing skills are required
  • Previous experience leading collaboration with global teams and maximizing its benefits