Data Engineer - Content Platform
New York, NY
Applications have closed
Spotify
We grow and develop and make wonderful things happen together every day. It doesn't matter who you are, where you come from, what you look like, or what music you love. Join the band!
Have you ever had a debate at a party about who originally wrote a song versus who covered it? Tried to find a sample you loved more than the song you heard it in? Wondered who was actually in the recording booth for your favorite song (on both sides of the glass)? Wondered what Britney Spears, Nsync, Pink, Katy Perry, Taylor Swift, and The Weeknd all have in common? (The same songwriter wrote Billboard number-one singles for all of them.) Who gets paid every time we sing “Happy Birthday”?
Music attribution at scale is one of the phenomenal unsolved technical problems of the music industry, and we’re building cutting-edge technology to solve it. Our goal is to tackle this problem for the more than 60 million music tracks playable on Spotify, building a knowledge graph through innovative machine learning models, deep domain expertise, and close integration with human-in-the-loop processes across Spotify and the industry. Content Platform’s catalog data powers Spotify experiences from Artist pages in the app, search and recommendations, human playlist curation, Spotify for Artists, and our music industry-facing strategy.
Come join our team of versatile engineers that share a common interest in distributed systems, their scalability, and continued development! Our teams are composed of product, machine learning, data and backend engineers, and subject matter experts. You will build the data pipelines that power our application, scale highly distributed systems, and continuously improve our engineering practices. Above all, your work will impact the way the world experiences music!
What you'll do
- Build large-scale batch and real-time data pipelines with data processing frameworks like Scalding, Scio, Storm, and Spark on the Google Cloud Platform.
- Use best practices in continuous integration and delivery.
- Help drive optimization, testing, and tooling to improve data quality.
- Collaborate with product managers, software engineers, ML experts, and other stakeholders, taking the learning and leadership opportunities that arise every single day.
- Work in multi-functional agile teams to continuously experiment, iterate and deliver on new product objectives.
Who you are
- You have professional experience working in a product-driven environment.
- You know how to work with high-volume heterogeneous data, preferably with distributed systems such as Hadoop, Bigtable, and Cassandra.
- You know how to write distributed, high-volume services in Java or Scala.
- You have a deep understanding of system design, data structures, and algorithms.
- You are knowledgeable about data modeling, data access, and data storage techniques and are able to leverage these skills to make architectural decisions based on product opportunities.
- You care about agile software processes and iterative delivery, data-driven development, reliability, and responsible experimentation.
- You understand the value of collaboration within teams.
Where you'll be
- We are a distributed workforce enabling our band members to find a work mode best for them!
- Where in the world? This role can be based anywhere in the Americas region where we have a work location.
- Prefer to work from an office instead of from home? Not a problem! We have plenty of options for your working preferences. Find more information about our Work From Anywhere options here.
- Working hours? We operate within the Eastern Standard Time zone for collaboration.
Spotify transformed music listening forever when we launched in 2008. Our mission is to unlock the potential of human creativity by giving a million creative artists the opportunity to live off their art and billions of fans the chance to enjoy and be passionate about these creators. Everything we do is driven by our love for music and podcasting. Today, we are the world’s most popular audio streaming subscription service with a community of more than 381 million users.
Tags: Agile Bigtable Cassandra Data pipelines Distributed Systems Engineering GCP Google Cloud Hadoop Machine Learning ML models Pipelines Scala Spark Streaming Testing
Perks/benefits: Career development Equity Team events