Senior Linux Systems Administrator

What you get to do in this role:

Team

As a key member of the Systems Administration team within Global Cloud Operations, you will be responsible for administration and operations of the global cloud infrastructure that runs our SaaS product.

This is an opportunity to be at the core of running a Cloud SaaS platform that scales to millions of users! The Cloud Operations team is responsible for availability and efficiency of the server infrastructure that runs our SaaS platform, while consuming and deploying products that have been newly developed by engineering teams.

 

Role

  • Contribute to Configuration Management and Infrastructure as Code for ServiceNow’s global private cloud.
  • Develop tools in Python, bash, and JavaScript to replace manual work and improve the customer maintenance experience.
  • Drive enhancements and bug fixes for large-scale automation projects such as patching, provisioning, and kickstart domains.
  • Design and implement procedures to carry out maintenance where automation and tooling cannot; drive resolution of root causes with internal team members.
  • Prepare new ServiceNow products and services for production readiness with design review, feedback to engineering teams, training, and testing.
  • Use broad knowledge and experience of systems administration and networking principles to proactively prevent and address incidents while constantly improving documentation.
  • Participate in escalations and Root Cause Analysis of issues.
  • Troubleshoot database backup and restore failures as well as perform database migrations.
  • Support operation of a wide variety of infrastructure services including Machine Learning and Prediction, Cloudera Big Data clusters, Kafka and RabbitMQ messaging, database encryption, E-Mail infrastructure at scale, DNS, Puppet, Elasticsearch, F5 BigIP and more.

To be successful in this role you have:

The ideal candidate will have a strong background in systems administration and engineering and an understanding of the components of a cloud infrastructure, including hardware platforms, operating systems, applications, databases, networks, and web and application servers. Experience in Site Reliability Engineering/DevOps and in managing large-scale server infrastructure in a cloud computing or MSP setting is highly desirable. Strong Linux and scripting expertise is a must. Candidates must have good communication skills and work well in an open, collaborative, dynamic team environment.

  • Solid experience with Linux (RedHat and/or CentOS)
  • Strong working experience with at least one of: Perl, Python, Java/JavaScript
  • ServiceNow development experience is desirable.
  • Strong experience with service troubleshooting, covering web front ends, systems, databases, and networks.
  • Previous direct exposure to administering fundamental internet services (DNS, Mail, Apache/Tomcat) with a strong understanding of application server stack principles such as the LAMP stack.
  • Some experience administering MySQL, MariaDB, or similar database engines.
  • Experience working with cloud vendors such as Microsoft Azure or Amazon Web Services is desirable.
  • Familiarity with Networking Technologies such as routing, switching and load balancing (VPN exposure is a huge plus)
  • Experience with systems and network performance and availability monitoring and analysis, as well as configuration management platforms (Nagios/Icinga, Cacti, Netcool, Monolith, Puppet, CFEngine, Chef, Splunk, Logstash, Ansible), is desirable.
  • Understanding of the ITIL v3 framework and how it applies to incident, problem, and change management.
  • Bachelor's degree in Engineering or Computer Science (or equivalent experience)