
Senior Data Engineer

Description

Sitecore delivers a digital experience platform that empowers the world’s smartest brands to build lifelong relationships with their customers. A highly decorated industry leader, Sitecore is the only company bringing together content, commerce, and data into one connected platform that delivers more than 500,000 digital experiences every day. Leading companies including American Express, ASOS, Carnival Cruise Lines, Kimberly-Clark, L’Oréal, and Volvo Cars rely on Sitecore to provide more engaging, personalized experiences for their customers. Learn more at Sitecore.com.

Position Summary: 
The Data Engineering and Analytics team is focused on making the large volumes of data ingested by the platform available for queries and analysis. Our challenges include building pipelines to ETL data from multiple sources and designing storage and schemas that improve query and ETL performance, all while ensuring our solutions are fully automated, cost-effective, and massively scalable across many clients with large volumes of data. From the configuration of our analytical pipelines to the development of the jobs that run on them, we take full ownership of our features. We are hosted entirely on AWS and build on services such as AWS CDK with CloudFormation, Step Functions, EMR, and Athena, with the ETL and enrichment jobs themselves written in Scala on Spark 3. We are looking for a Data Engineer to join us and help improve our data pipelines.
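
For a purely illustrative sense of the kind of work described above, here is a minimal sketch of a Scala job on Spark 3 that reads raw events and writes partitioned Parquet for downstream Athena queries; the object name, bucket paths, and schema are hypothetical, not Sitecore's actual code:

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions._

  // Illustrative sketch only; paths, column names, and the job name
  // are hypothetical and not taken from Sitecore's codebase.
  object EventEtlJob {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("event-etl")
        .getOrCreate()

      // Read raw JSON events landed by an upstream ingestion step.
      val raw = spark.read.json("s3://example-bucket/raw/events/")

      // Light enrichment: derive a date column to partition on.
      val enriched = raw.withColumn("event_date", to_date(col("timestamp")))

      // Write columnar output laid out for efficient Athena queries.
      enriched.write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3://example-bucket/curated/events/")

      spark.stop()
    }
  }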

Responsibilities:  

  • Help us improve the scalability, reliability, automation, and cost-efficiency of our data pipelines
  • Design, develop, test, deploy, maintain, and improve our software stack
  • Lead complex technical conversations and decisions
  • Own individual project priorities, deadlines, and deliverables

Requirements:  

  • 5+ years of experience writing and deploying production code
  • Fluent in Scala, with experience in Python and Java
  • Strong functional and object-oriented programming experience
  • Experience with Big Data/distributed processing tools such as Spark, Hive, Kafka, HDFS, and columnar file formats (e.g., Parquet, ORC)

Any of the following would be a bonus:

  • Experience ETL-ing large volumes of data
  • Experience with Delta Lake
  • DevOps experience: not necessarily the AWS services listed above, but familiarity with CI/CD and Infrastructure as Code tools
  • Experience with OpenAPI/JsonSchema/Swagger
  • Experience building microservices and RESTful APIs