Principal Data Engineer - Platform & ML
Remote or Madrid (HQ)
Applications have closed
Cabify
Request a ride through the app or sign up to drive with us. In your city, at your own pace, and anywhere. We're ready to take you there.
About the position
This is an excellent opportunity to work in a company with a highly technological product that generates hundreds of thousands of events per second. This vast sea of data is not only stored and organized but also consumed to improve all aspects of the operation: pricing, dispatching, marketing, governance, and many others.
At Data Engineering, we operate dozens of services (Scala, Golang, Python), pipelines (Apache Beam, Airflow), and our in-house developed Machine Learning platform. We are a hands-on team: we manage our own infrastructure (GCE and AWS) and Kubernetes clusters (GKE).
Cabify is a global company with a very complex product, but at the same time with the perfect size to allow you to have a tangible impact on the final product. You will be able to build and improve the platform that provides trusted data at scale to the rest of the company. And you will do it as part of a team of experienced data engineers, helping each other grow technically and professionally.
You will:
- Design and develop end-to-end data solutions and modern data architectures for Cabify products and teams (streaming ingestion, data lake, data warehouse...).
- Evolve and maintain Lykeion, a Machine Learning platform developed along with the Data Science team, to take care of the whole lifecycle of models and features. It includes a feature store, which allows other groups inside Cabify to make better decisions based on data, and a prediction platform to serve ML models.
- Design and maintain complex APIs that expose data at scale and help other teams make better decisions.
- Provide the company with data discoverability and governance.
- Collaborate with other technical teams to define, execute and release new services and features.
- Manage and evolve our infrastructure. Continuously identify, evaluate, and implement new tools and approaches to maximize development speed and cost efficiency.
- Extract data from internal and external sources to empower our Analytics team.
What we’re looking for:
We are looking for experienced data engineers with excellent know-how in large-scale distributed systems:
- 5+ years of experience coding and delivering complex data engineering projects.
- Fluency in different programming languages (we work with Python, Scala, and Go; you don’t need to master all three of them).
- Deep understanding of:
  - Message delivery systems and streaming processing (Kafka, RabbitMQ, Akka streams, Apache Beam…)
  - Data processing technology stacks and distributed processing (Hadoop, Spark, Apache Beam, Apache Flink...)
  - Storage technologies (file-based, relational, columnar, document-based, key-value...)
  - Orchestration tools such as Airflow, Luigi, or Dagster.
  - Cloud infrastructures (GCP, AWS, Azure)
  - Automation/IaC tools (Terraform, Puppet, Ansible…)
  - MLOps
The good stuff:
We’re a company full of happy, motivated people and we never want that to change. Here are more reasons why it rocks to be part of our high-performance team.
- Salary conditions: 55k - 85k
- Very competitive stock options plan.
- Remote position, or on-site/hybrid position at our Madrid HQ.
- 22 vacation days + 2 extra days for the Christmas period.
- Recharge day: in addition to the vacation days just mentioned, the third Friday of every month is also a day off!
- Personal development programs based on our career paths, and a budget for training.
- Cabify staff free rides.
- Flexible compensation plan: Restaurant Tickets, Transport Tickets, healthcare and childcare.
- Regular team events.
- All the equipment you need (you only have to bring your talent).
- A pet room in the office, so you don’t have to leave your furry friend at home. And last but not least… free coffee and fruit!
Join us!
* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰