Site Reliability Engineer - DevOps - Payments Card Technology

JP Morgan
Dublin, Ireland
August 14, 2021

Site Reliability Engineer

As a Site Reliability Engineer (SRE) you will help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation. You’ll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment you’ll take the lead on relevant projects, supported by an organization that provides the support and mentorship you need to learn and grow. As an SRE you’ll be focused on running better production applications and systems.
Responsibilities:

Design, code, test and deliver software to automate manual operational work.
Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents.
Engage with the development team throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes.
Identify application patterns and analytics in support of better service level objectives.
Design self-healing and resiliency patterns.
Design automated software and product upgrades, change management, and release management solutions.
Coach or manage teams as applicable.
Participate in the 24x7 support coverage as needed.
Expertise in Incident, Problem and Change Management processes and tools.
Collaborate across Application Development, Product and production management to establish and maintain Service Level Objective (SLO), Service Level Indicator (SLI) and Error Budget for key Production services.
Implement required telemetry and ability to monitor and measure the quality of service in real-time against the established SLO.
Manage, track and validate all changes to the Production and Disaster Recovery environments.
Manage priority incidents and leverage cross-functional teams to quickly eliminate impacts.
Escalate issues and risks effectively across the supporting framework when necessary.
Ability to align IT service offerings with business strategies, goals, and objectives.
Troubleshoot key technical issues, or escalate and work with the appropriate technology teams to provide solutions.
Respond promptly to service requests from client-facing support teams, operations partners, etc.
Manage application and infrastructure to maximize stability and resiliency. Leverage and improve monitoring and alerting capabilities to ensure application SLAs are met.
Strong focus on automation and processes. Design, implement, improve and utilize key monitoring tools.

Qualifications:

Bachelor’s degree or equivalent experience in a software engineering discipline
Expertise in at least one technology stack designing, coding, testing, and delivering software
Proficiency in one or more technology domains; may be a cross-domain expert able to solve complex, mission-critical problems within a business or across the firm
Working knowledge of infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage and networks)
Excellent debugging and troubleshooting skills
Expert in performance monitoring and capacity management of large systems using various tools
Deep expertise in instrumentation, customization and usage of modern monitoring tools such as Dynatrace, AppDynamics, Grafana, Prometheus, ThousandEyes, Splunk, Geneos, etc.
Expert in at least one technology stack (Java/J2EE/C#.NET) with designing, coding, testing, and delivering software
Exposure to Python and willingness to learn and become an expert in Python for creating application health dashboards and machine learning projects
Expert in at least one relational database (SQL Server, Oracle, DB2, etc.)
Working knowledge of Groovy, Batch scripting, Ansible, PowerShell or Shell Scripting
Working knowledge of infrastructure components like routers, load balancers and networks
Comfortable working in Agile mode and proficient in Continuous Integration and Continuous Delivery
Solid understanding of object-oriented design methodologies
Solid analytical and problem-solving skills
Attention to detail and time-management skills
Payments and Card experience a big plus
Regulatory experience a plus: PSD2, Strong Customer Authentication, EBA, etc.
