The Observability organization at Citi is responsible for providing full-stack observability solutions (metrics, events, logs, traces) to infrastructure, security, and application SRE, operations, and development teams, covering over 10,000 application instances.
The observability solutions utilize products such as Splunk, Elasticsearch, Grafana/Prometheus, AppDynamics, NOI, and ITM6 hosted on Linux machines; these products are managed through a robust Ansible-based deployment model. We are looking for a data engineer to join the team with experience in data modelling and in analyzing enterprise inventory technical reference data as well as granular telemetry, and with an appetite for architecting a common data model that can be adopted by any product and solution.
This is a senior level, hands-on position responsible for a variety of data engineering activities.
Responsibilities:
- Create and maintain optimal data models and data pipelines for the enterprise observability solution
- Identify automation opportunities for SRE teams based on data analysis
- Collaborate in building event correlation and grouping policies based on event data, reducing ticket volume for application and infrastructure teams
- Architect and design a common data model, standard metadata taxonomy, data pipeline, and data curation for complex enterprise observability solutions covering infrastructure, system, and security logs and metrics (see the sketch after this list)
- Architect and implement measurement criteria in the data pipeline for completeness, timeliness, and accuracy of data
- Build processes and policies supporting data transformation, data structures, metadata, dependencies, and data dictionaries across the pipeline
- Utilize Citi’s data analytics tools such as Grafana and become the data analytics SME, harnessing available technical asset inventory data to gain insights, improve data quality, and increase self-service
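As a hedged illustration of what such a common data model might look like, here is a minimal Python sketch. Every field name, enum value, and source-system label below is an assumption invented for this example, not Citi's actual taxonomy.

```python
# Minimal sketch of a common observability data model. All names here are
# illustrative assumptions, not Citi's actual metadata taxonomy.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Severity(Enum):
    """Normalized severity so events from Splunk, NOI, AppDynamics, etc.
    map onto a single scale."""
    INFO = 1
    WARNING = 2
    CRITICAL = 3


@dataclass
class TelemetryEvent:
    """One normalized record, regardless of which product emitted it."""
    event_id: str        # globally unique identifier
    source_system: str   # e.g. "splunk", "appdynamics", "noi" (illustrative)
    asset_id: str        # join key into the technical asset inventory
    severity: Severity
    message: str
    occurred_at: datetime
    ingested_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    labels: dict = field(default_factory=dict)  # free-form taxonomy tags

    def ingest_lag_seconds(self) -> float:
        """Timeliness measure: delay between occurrence and ingestion."""
        return (self.ingested_at - self.occurred_at).total_seconds()
```

Keying every event on an asset_id that joins against the technical asset inventory is the design choice that lets one model serve correlation, grouping, and quality measurement across products; the ingest_lag_seconds helper hints at how timeliness criteria could be measured directly in the pipeline.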
Qualifications:
- 3+ years of experience as a Data Engineer
- Technical expertise in big data and cloud technologies: Kafka, Spark, Hadoop, Hive, HDFS, Cloudera (a stream-processing sketch follows this list)
- Experience creating and reading entity relationship diagrams (ERDs)
- Experience with Lucene query syntax or the JSON-based Query DSL, Splunk SPL, and relational SQL and NoSQL databases, including Oracle, PostgreSQL, MongoDB, or Cassandra
- Experience building and optimizing high-volume data pipelines, architectures, and data sets using Kafka, the ELK stack, and Splunk
- Experience with data connectivity methodologies such as APIs (REST/SOAP), ODBC/JDBC, HTTP webhooks, JSON, etc.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ stores
- Experience supporting and working with cross-functional teams in a dynamic environment
- Excellent teamwork and proactive attitude
- Experience with Python, Java, Golang, or Scala
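To ground the stream-processing expectation above, here is a minimal sketch of the ingest side of such a pipeline, assuming the open-source kafka-python client. The topic name, broker address, and payload shape are hypothetical.

```python
# Minimal ingest sketch, assuming the open-source kafka-python client
# (pip install kafka-python). Topic, brokers, and payload shape are
# hypothetical examples, not a real Citi configuration.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "raw-observability-events",              # hypothetical topic name
    bootstrap_servers=["localhost:9092"],    # hypothetical broker
    group_id="event-normalizer",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    payload = message.value
    # Simple completeness check: drop records missing mandatory fields.
    if not all(key in payload for key in ("event_id", "asset_id", "message")):
        continue
    # Downstream, the record would be mapped onto the common data model
    # and indexed into the ELK stack or Splunk.
    print(payload["event_id"], payload["message"])
```

A production consumer would add batching, error handling, and schema validation; the point here is only the shape of the filter-and-normalize step.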
Good-to-Have Skills:
- Experience with Grafana, the ELK stack, and Splunk is a plus
- Experience applying machine learning or data science to predict events and fatal issues in applications and infrastructure is a plus (see the sketch below)
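As a toy illustration of that predictive angle, the sketch below fits a classifier with scikit-learn. The features, labels, and numbers are entirely synthetic stand-ins for what would really be historical event and incident data.

```python
# Toy predictive sketch using scikit-learn. All data below is synthetic;
# a real model would train on historical event and incident history.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-host features: [error rate, CPU %, restart count]
X = np.array([
    [0.01, 35, 0],
    [0.02, 50, 1],
    [0.30, 92, 4],
    [0.45, 97, 6],
])
y = np.array([0, 0, 1, 1])  # 1 = a fatal issue followed within the hour

model = LogisticRegression().fit(X, y)
# Estimated probability that a host showing these readings fails next.
print(model.predict_proba([[0.25, 90, 3]])[0, 1])
```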
Education:
- Bachelor’s degree/University degree or equivalent experience in Computer Science, Statistics, Information Systems, or another quantitative field
- Master’s degree is a plus