Groupon’s mission is to become the daily habit in local commerce and fulfill our purpose of building strong communities through thriving small businesses by connecting people to a vibrant, global marketplace for local services, experiences, and goods. In the process, we’re positively impacting the lives of millions of customers and merchants globally. Even with thousands of employees spread across multiple continents, we still maintain a culture that inspires innovation, rewards risk-taking, and celebrates success. If you want to take more ownership of your career, then you're ready to be part of Groupon.
At Groupon, we know that great people make great companies. We're a "best of both worlds" kind of company. We're big enough to have the resources and scale, but small enough that a single person has a surprising amount of autonomy and can make a meaningful impact. We're curious, fun, a little intense, and kind of obsessed with helping local businesses thrive. Does that sound like a compelling place to work?
The Data Engineering team at Groupon is at the heart of all things “data”, designing and building the next generation of data pipelines for the data science and machine learning community. Our mission is to empower data analysts and data scientists across all business units to make better business decisions. This role calls for a unique combination of skills: computer science fundamentals (distributed systems, big data), cloud platforms, and scalable, high-performance production systems.
Key Responsibilities
- Design and implement data pipelines that provide access to large datasets and transformation power for data across the organization.
- Write complex but efficient code to transform raw and curated data into business-question-oriented datasets and data visualizations.
- Work with big data and distributed systems using technologies such as Spark, AWS EMR, GCP DataProc, and Python (see the PySpark sketch after this list).
- Actively contribute to the adoption of sound software architecture, development best practices, and new technologies. We are always improving how we build software; we need you to help.
- Interface with other technology teams to extract, transform, and load data from a wide variety of sources using open-source and AWS/GCP big data technologies.
- Explore and learn the latest GCP/AWS technologies to provide new capabilities and increase efficiency.
- Collaborate with business users, infrastructure engineers, and data scientists to identify and help adopt best practices for gathering and transforming big data.
- Identify, design, and develop new tools and processes that improve data storage and compute for the Data Engineering and Data Consumption teams and their users.
- Interface directly with stakeholders, gather requirements, and own automated end-to-end data engineering solutions.
- Provide technical leadership and mentorship to other engineers on data engineering best practices.
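To give a concrete flavor of the pipeline work above, here is a minimal PySpark sketch of a batch transformation job. It is illustrative only: the bucket paths and column names (order_ts, city, amount_usd, order_id) are hypothetical placeholders, not Groupon's actual schema.

```python
# Illustrative sketch of a batch transformation job; all paths and
# column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_bookings_by_city").getOrCreate()

# Raw, curated input (e.g., landed on S3/GCS by an upstream ingestion job).
orders = spark.read.parquet("s3://example-bucket/raw_orders/")

# Turn raw events into a business-question-oriented dataset:
# "What were gross bookings per city per day?"
daily_bookings = (
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date", "city")
    .agg(
        F.sum("amount_usd").alias("gross_bookings_usd"),
        F.countDistinct("order_id").alias("order_count"),
    )
)

# Partitioned output that downstream analysts and data scientists can query.
(
    daily_bookings.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/daily_bookings_by_city/")
)

spark.stop()
```

In production, a job like this would typically run on EMR or DataProc and be scheduled by an orchestrator such as Airflow (see the sketch after the qualifications list below).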
Preferred Qualifications
- Bachelor’s degree in computer science, mathematics, or a related technical field
- 5+ years of relevant experience in data engineering or a related field
- Strong experience in at least one programming language, preferably Python
- At least 4 years of experience in Spark development
- At least 1 year of experience with a workflow orchestrator such as Airflow, NiFi, Luigi, or Azkaban (see the illustrative Airflow sketch after this list)
- A clear understanding of testing methodologies and AWS best practices
- GCP/AWS experience and/or certifications are a big plus
- Proficiency in big data technologies (e.g., Hadoop, Hive, Spark, EMR)
- Excellence in technical communication and experience working directly with stakeholders
- Experience maintaining data pipelines built on big data technologies such as Hadoop, Hive, Spark, and EMR
- Demonstrated ability to coordinate projects across functional teams, including engineering and product management
- Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
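As a hedged illustration of the orchestration experience mentioned above, here is a minimal Airflow DAG that schedules the earlier hypothetical PySpark job followed by a validation step. The DAG id, task ids, scripts, and schedule are assumptions made for the example, not a description of Groupon's pipelines.

```python
# Illustrative Airflow DAG sketch; all identifiers and commands are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_bookings_pipeline",
    default_args=default_args,
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",  # run once per day, after upstream data lands
    catchup=False,
) as dag:
    # Submit the (hypothetical) PySpark job from the earlier sketch to a cluster.
    run_spark_job = BashOperator(
        task_id="run_daily_bookings_job",
        bash_command="spark-submit daily_bookings_by_city.py",
    )

    # Basic post-load check; in practice this might assert row counts or
    # data freshness against the curated output.
    validate_output = BashOperator(
        task_id="validate_output",
        bash_command="python validate_daily_bookings.py",
    )

    run_spark_job >> validate_output
```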