Primary Responsibilities
· Work with multiple engineering teams to design, build, maintain, and support SaaS-based services and their associated toolchain
· Ensure infrastructure and services maintain the required level of security, availability, reliability, scalability, and performance
· Build incident management, operational monitoring and alerting capabilities to proactively report, troubleshoot, and fix problems
· Assist in achieving and maintaining industry security audit certifications (ISO, SOC 2, HIPAA)
· Build automation around the infrastructure and services used in product development, testing, and CI/CD pipelines
· Function as a subject matter expert to internal and external partners and stakeholders
· Research new technology areas, innovations, and ideas
Knowledge, Skills and Abilities
· Experience setting up and using incident and on-call management systems
· Experience building tools to collect and visualize data, including dashboards and alerting and monitoring systems
· Experience deploying secure infrastructure and services in one or more cloud environments, such as AWS or Azure
· Experience with configuration management and deployment automation tools, such as Terraform, Ansible, and Packer
· Experience with CI/CD pipeline tools such as GitLab CI or Azure DevOps
· Experience with container (Docker) and orchestration (Kubernetes) systems
· Proficiency in scripting languages such as Python and Bash
· Solid understanding of Unix-like operating systems
· Good understanding of networking fundamentals: TCP/IP, HTTP, DNS, load balancing, firewalling, etc.
· Proven ability to act as a leader in communicating conceptual ideas and designs
· Demonstrated ability to communicate and collaborate effectively with teams and engineers across different sites
Qualifications
· Bachelor's degree or higher in one or more of these disciplines: Electrical Engineering, Computer Engineering, Computer Science, or a related field
· 3+ years of professional experience as an SRE or DevOps engineer