Qualifications:
- 5 years of relevant experience in an SRE, Apps Development, or Systems Analysis role
- 3+ years of Amazon Web Services (AWS) administration
- 2+ years of experience within a high-performance, 24x7, DevOps or SysOps team
- Extensive experience in programming, deploying, and operating software applications
- Experience in implementing successful projects
- Ability to adjust priorities quickly as circumstances dictate
- Proven experience in mentoring more junior team members
- Consistently demonstrates clear and concise written and verbal communication
Education:
- Bachelor’s degree/University degree or equivalent experience
- Master’s degree preferred
At Citi, we’re passionate about building software that solves problems. We count on our site reliability engineers (SREs) to give our users the rich feature set, high availability, and stellar performance they need to pursue their missions. As we expand our customer deployments, we are seeking an experienced SRE to deliver insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.
This is an excellent opportunity to join a new team, make a big impact across a large organization, and gain exposure to a wide variety of ICG businesses and technology initiatives.
Responsibilities:
- Embed into application teams on a rotating basis
- Help organize, secure, and automate existing infrastructure and deployments
- Develop new capabilities, coordinating implementation across a large number of teams, including infrastructure, developer tools, and information security
- Adapt existing capabilities to the team's specific circumstances, technology stack and business requirements
- Learn from application teams' internally developed best practices and help adopt them across the organization as new capabilities
- Implement, monitor, and maintain CI/CD frameworks
- Prove and iterate on capabilities by applying them to real-world applications
- Work closely with developers to provide feedback and drive operational improvements within our products and operations infrastructure
- Take part in blame-free post-incident reviews (postmortems) and help guide changes to avoid future incidents
- Influence a culture of Site Reliability Engineering. Engage in training and mentoring to help develop other engineers with this mindset
- Automate, automate, automate
- Report on key performance metrics (SLAs), including major incidents and major projects
- Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.