Principal Data Engineer - Platform & ML

Remote or Madrid (HQ)

Applications have closed

Cabify

Request a ride through the app or sign up to drive with us. In your city, at your own pace, and anywhere. We are ready to take you.


About the position

This is an excellent opportunity to work in a company with a highly technological product that generates hundreds of thousands of events per second. This vast sea of data is not only stored and organized but also consumed to improve all aspects of the operation: pricing, dispatching, marketing, governance, and many others.

At Data Engineering, we operate dozens of services (Scala, Golang, Python), pipelines (Apache Beam, Airflow), and our in-house developed Machine Learning platform. We are a hands-on team: we manage our own infrastructure (GCE and AWS) and Kubernetes clusters (GKE). 

Cabify is a global company with a very complex product, but at the same time with the perfect size to allow you to have a tangible impact on the final product. You will be able to build and improve the platform that provides trusted data at scale to the rest of the company. And you will do it as part of a team of experienced data engineers, helping each other grow technically and professionally. 

 

You will:

  • Design and develop end-to-end data solutions and modern data architectures for Cabify products and teams (streaming ingestion, data lake, data warehouse...).
  • Evolve and maintain Lykeion, a Machine Learning platform developed along with the Data Science team, to take care of the whole lifecycle of models and features. It includes a feature store, which allows other groups inside Cabify to make better decisions based on data, and a prediction platform to serve ML models.
  • Design and maintain complex APIs that expose data at scale and help other teams make better decisions.
  • Provide the company with data discoverability and governance.
  • Collaborate with other technical teams to define, execute and release new services and features.
  • Manage and evolve our infrastructure. Continuously identify, evaluate, and implement new tools and approaches to maximize development speed and cost efficiency.
  • Extract data from internal and external sources to empower our Analytics team.

 

What we’re looking for:

We are looking for experienced data engineers with excellent know-how in large-scale distributed systems:

  • 5+ years of experience coding and delivering complex data engineering projects.
  • Fluency in different programming languages (we work with Python, Scala, and Go; you don’t need to master all three of them).
  • Deep understanding of:
    • Message delivery systems and stream processing (Kafka, RabbitMQ, Akka Streams, Apache Beam…)
    • Data processing technology stacks and distributed processing (Hadoop, Spark, Apache Beam, Apache Flink...)
    • Storage technologies (file-based, relational, columnar, document-based, key-value...)
    • Orchestration tools such as Airflow, Luigi, or Dagster.
    • Cloud infrastructures (GCP, AWS, Azure)
    • Automation/IaC tools (Terraform, Puppet, Ansible…)
    • MLOps

 

The good stuff:

We’re a company full of happy, motivated people and we never want that to change. Here are more reasons why it rocks to be part of our high-performance team.

  • Salary conditions: 55k - 85k
  • Very competitive stock options plan.
  • Remote position, or on-site/hybrid position at our Madrid HQ.
  • 22 vacation days + 2 extra days for the Christmas period.
  • Recharge day: in addition to the vacation days above, the third Friday of every month is also a day off!
  • Personal development programs based on our career paths, and a budget for training.
  • Free Cabify rides for staff.
  • Flexible compensation plan: Restaurant Tickets, Transport Tickets, healthcare and childcare.
  • Regular team events.
  • All the equipment you need (you only have to bring your talent).
  • A pet room in the office, so you don’t have to leave your furry friend at home.
  • And last but not least… free coffee and fruit!

Join us!

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Tags: Airflow Ansible APIs Architecture AWS Azure Dagster Data warehouse Distributed Systems Engineering Flink GCP Golang Hadoop Kafka Kubernetes Machine Learning ML models MLOps Pipelines Python RabbitMQ Scala Spark Streaming Terraform

Perks/benefits: Career development Competitive pay Equity Flex vacation Gear Snacks / Drinks Team events

Regions: Remote/Anywhere Europe
Country: Spain
