Graduate/Entry Level

Senior Associate Site Reliability Engineer (SRE)

About the Team

We are part of the Tenant Lifecycle Engineering team and are primarily responsible for deploying and supporting the Conversion Test environments. These environments mimic our customer environments and provide a platform on which pre-release code can be tested from a deployment and upgrade standpoint. This helps ensure there are no issues when code is deployed to our customers (the patch). Activities include, but are not limited to:
  • Weekly deployment of code to the test environments
  • Maintaining, debugging and researching issues with various components of the deployment and the environment
  • Providing service to the Update Validation team (the primary consumer of this environment)
  • Automating manual tasks and streamlining processes

We also work with the wider Tenant Lifecycle Engineering team to support the customer-facing environments for our end customers. This includes providing training opportunities for the larger team in the deployment process, patching our customer environments, and pursuing resolutions to issues found during the patch process.

 

About the Role

  • Deploy and support the Conversion Test environments
  • Gain familiarity with the Workday architecture and stack in order to analyze and troubleshoot issues
  • Collaborate with multi-functional teams to develop and implement solutions
  • Assist in implementing the next-generation orchestration tool in the Conversion Test environments
  • Develop, support and improve utilities that automate manual tasks and streamline processes
  • Work with a multi-national and diverse team
  • Work across distributed data center infrastructure and cloud platforms

Basic Qualifications

  • 2+ years of solid operations experience
  • Proficiency with at least one scripting language, preferably Python
  • Extensive engineering experience with Linux (CentOS preferred)
  • Hands-on experience in distributed environments

 

Other Qualifications

  • Ansible experience a plus
  • Jenkins and Kubernetes experience nice to have
  • Understanding of software development best practices such as code management and CI/CD
  • Excellent analytical skills for troubleshooting and problem determination
  • Can work independently, with the mindset that everything can be automated
  • Takes ownership of tasks and drives them to resolution
  • Can multitask efficiently
  • Experience collaborating with multi-functional global and remote teams with a diverse set of backgrounds
  • Excellent documentation skills
  • The ability to work some nights and weekends is required as part of the production update rotation