Software Developer/ Engineer/ Architect

Senior Site Reliability Engineer - Databases

The Position

Who We Are:

As a member of the organisation, you will be dedicated to improving the reliability of our end-to-end data and database infrastructure. Your work will integrate directly with our products.

Our core infrastructure receives hundreds of millions of tweets per day and serves tens of billions of API requests. We also serve over 2 billion search queries per day, render millions of ad impressions, and process hundreds of terabytes of log and interaction data daily.

We investigate difficult operational issues from the software, systems, automation, and process perspectives. We will understand the challenges of integrating disparate infrastructures into a new facility, processes, and procedures.

We develop services and tooling to automate repetitive tasks and provide self-service applications.

We actively participate in the vision to move away from tasks with a high operational cost, and contribute to services that can shrink and expand based on demand, self-heal, roll out automatically, and so on.

We will train and invest in our team members and make sure that they are successful in supporting a large variety of systems and products.

We’re looking for an industry-experienced SRE to join us and help us further Twitter’s reliability by building services and automating some of our biggest operational tasks. The candidate must have relevant experience building and operating production systems, as well as a strong programming background.

Your responsibilities include:

Working closely with engineering teams to design, build, and maintain systems, and helping them with database selection, schema design, and query tuning

Using your expertise to tune our databases and push them beyond their normal limits

Solving issues across the entire stack: hardware, software, application and network

Mentoring other SREs on standard methodology for everything, from monitoring to solving complex code and database issues

Identifying and driving opportunities to improve automation for the company; scoping and building automation for the deployment, management, and visibility of our services

Actively participating in and contributing to code reviews and technical design documents, with an eye toward identifying performance and reliability bottlenecks

Representing the SRE organisation in design reviews and operational readiness exercises for new and existing services

Participating in on-call (24x7) and customer support (8x5) rotations

Company Description

Twitter is committed to serving the public conversation by helping people stay informed, inform others, and discuss what matters. Our Curation team is on a mission to better facilitate this through the curation of the best, most relevant, and timely content that reaches, engages and delights one of the largest daily audiences in the world.

Twitter’s Curation team sits within its Consumer Product team, which works directly on developing the core Twitter product with our customers in mind.

Additional Information

We care about making work happy and productive for everyone, with a permanent option to work remotely or regularly work from home when our offices reopen; a home office expense budget; wellness benefits; regular #NoMeetingFridays; and up to 20 weeks of parental leave. 

A few other things we value:

Challenge - We work with Twitter's product and standards teams to solve some of the industry’s hardest content problems. Come to be challenged, learn, and thrive as a curator.

Diversity - Diversity makes us a better organization and team. We value diverse backgrounds, ideas, and experiences.

Work-Life Balance - We work hard, but we believe with hard work should come balance.

We are committed to an inclusive and diverse Twitter. Twitter is an equal opportunity employer. We do not discriminate based on race, ethnicity, color, ancestry, national origin, religion, sex, sexual orientation, gender identity, age, disability, veteran status, genetic information, marital status, or any other legally protected status.

Qualifications

Who You Are:

5+ years of proven experience in software support, reliability, or operations engineering in production environments

Strong, proven ability to write modular and well-tested code in Python or Go

Experience in driving multi-quarter, cross-team projects to completion

Demonstrated ability to support any or all of the following: relational databases (MySQL, Postgres, etc.), Hadoop, Druid, BigQuery, and other data management services, both on-prem and in the public cloud

Ability to work well with, and influence, a wide range of personalities at all levels

Adaptable and able to focus on the simplest, most efficient, and most reliable solutions

A track record of successful practical problem solving, with excellent written communication, social communication, and documentation skills

Desired: Ability to lead technical teams through design and implementation across an organization.

Desired: Experience with open source projects like Vitess, Orchestrator, Percona, Airflow and other database tools