About the Team
The Data Platform and Observability team is based in Pleasanton, CA; Boston, MA; and Dublin, Ireland. We enable real-time insights across Workday’s platforms, infrastructure and applications. Our focus is on the development of a large-scale distributed data platform to support mission-critical Workday applications.
The team provides software for the collection, ingestion, storage and visualization of critical data assets. We handle hundreds of terabytes of data in the form of billions of messages produced daily by Workday applications and underlying services. If you enjoy writing efficient software or tuning and scaling large distributed systems, you will enjoy working with us.
Do you want to work on leveraging Workday’s vast computing resources with its rich and extensive datasets? Do you want to work with world-class engineers and facilitate the development of the Observability data platform? If so, we should chat.
About the Role
- You will architect, design and build critical Kubernetes-orchestrated Data Platform and Infrastructure services that need high reliability and availability at massive scale.
- You will design and develop core software modules used for real-time and batch data processing.
- You will work with all aspects of data processing with a keen eye for data quality, data integrity and data availability.
- You will debug, troubleshoot and scale distributed systems. You will participate in the on-call rotation supporting the data platform.
- You love building distributed applications that are orchestrated through Kubernetes and make use of innovative technologies such as the Istio service mesh and containerization.
- You must currently be in a hands-on role and have strong coding skills (Ruby/Java/Scala/Python/Go).
- You must have a solid understanding of high-performance data capture and collection systems, how to design APIs around these systems and how to design for reliable delivery of data.
- You have experience building API services (REST, gRPC, etc.) that scale to millions of requests per second, and you are an expert at scaling such systems.
- You understand the internals of distributed systems like Kubernetes, Kafka, Spark, Flink, Elasticsearch, etc.
- You can prioritize multiple tasks in a fast-paced environment.
- You have strong written and verbal communication skills.