Graduate/Entry Level

Senior Site Reliability Engineer - Service Engineering

About the Role

Are you a creative SRE looking for more opportunities to automate and improve reliability, or an innovative Software Developer who enjoys building solutions to reduce toil and manual effort?

With constant attention to our customers (both internal and external), you will deliver quickly on a wide range of daily tasks, from environment provisioning, performance monitoring, and environment troubleshooting to ad-hoc requests and automation efforts, while keeping the work being performed transparent.

This role requires a good understanding of Linux systems in a production environment, as you will be part of a team that writes and maintains scripts (Bash, Ruby, Python) that support public and private cloud environments.

About You

We would love to hear from you if you like trying new techniques and approaches to sophisticated problems, love to learn new technologies, and are a natural collaborator and a phenomenal teammate who brings out the best in everyone around you.

You understand that the availability of the Workday Service is paramount and requires on-call participation, careful planning of changes, detailed runbooks, and effective teamwork. If a task is repeated manually, you find a way to automate it. More than that, you deliver!

Basic Qualifications

  • 5-7+ years of experience running and maintaining a 24x7 large-scale production environment, preferably across multiple data centers
  • BS or MS degree in Computer Science, Engineering, or related technical field, or equivalent experience
  • Experience deploying and operating: Apache Tomcat and HTTPd, MySQL, Cape Clear ESB, Subversion, Java web applications
  • Proven expertise with Linux and debugging fundamentals, and a solid understanding of how to quickly isolate issues.
  • Experience with many tool sets: Chef, Puppet, OSSEC, Splunk, SOLAS, BladeLogic, Ansible, JIRA, Confluence

Other Qualifications

  • Strong understanding of enterprise-level thinking on several fronts: documentation, runbooks, root cause analysis, capacity trending, bug fixes, and scripting
  • Secret passion for monitoring. When false positives show up on your radar, you quickly address them. Your inner wish is to "make monitoring phenomenal again".
  • Can balance multiple tasks, make the right business decisions and tackle problems while under pressure, and prioritize and organize effectively.
  • Ability to work some nights and weekends is required as part of the on-call support and production update rotation.
  • Experience with CentOS, SunOS, Solaris, Linux, and DevOps practices is a plus