Data Engineer II
Posted 1 month ago
Who are we?
At Careem, our mission is to simplify and improve the lives of people and create an awesome organisation that inspires. With this broad mission, we started by improving transportation and delivery in the region. We are now expanding into payments and launching a super-app that hosts multiple Careem and third-party apps, to further simplify and improve people’s everyday lives.
We built the first multi-billion dollar tech startup in the MENAP region. The first line of code was written in Pakistan, and we built on it further in Dubai and Berlin. We operate in 100+ cities across 11 countries. We officially joined Uber in early 2020. Along the way, we have attracted top global talent and built a culture of bold ambition, shooting for the moon, innovation within tight constraints, and being Careem/gracious.
About The Role:
The Careem Big Data Platform team’s mission is to provide a platform that abstracts big data complexities and enables fast, reliable and secure access to data. As the leader of this team, you will be at the forefront of fulfilling this mission. You will work with and lead the top talent of the region, leveraging modern big data tools and techniques to solve the region’s day-to-day problems on top of our own in-house data platform, serving users in real time.
Main activities and responsibilities:
- Define the architecture and scope of various Big Data solutions, and deliver them.
- Support other teams by providing guidance on data modeling, data usage, processing and how they can best leverage the platform.
- Build scalable data pipelines to ingest data from a variety of data sources, identify critical data elements and define data quality rules.
- Leverage Spark/Hadoop ecosystem knowledge to design and develop capabilities to deliver innovative and improved data solutions.
- Provide insights on areas for improvement, including data governance, best practices and large-scale processing.
- Support bug fixing and performance analysis along the data pipeline.
Requirements:
- 3+ years of experience as a software engineer, with strong skills in at least one programming language (mandatory), preferably Scala, Java or Python.
- 1+ year of experience with Spark on Hadoop, EMR, etc.
- Experience working with real-time data processing using Kafka, Spark Streaming or similar technology.
- Experience with distributed systems and design/implementation for reliability, availability, scalability and performance
- Proven experience with AWS technologies like S3, EMR and CloudFormation.
- Creative and innovative approach to problem-solving
Good to Have:
- Experience with CI/CD using Jenkins, Terraform or other related technologies.
- Familiarity with containerized platforms like Docker and Kubernetes.
- Experience working with Hive, Presto or other querying frameworks.