Senior Data Engineer - Personalization
Boston, MA
Applications have closed
Spotify
We grow and develop and make wonderful things happen together every day. It doesn't matter who you are, where you come from, what you look like, or what music you love. Join the band!
The ProgRes squad is looking for a strong Data Engineer with backend experience who can help us build solutions to limit the spread of misinformation, reduce algorithmic bias, and ensure fair representation across our recommendation products. Teams across Personalization depend on ProgRes tools to make sure that we recommend safe and high-quality content to our users.
Our team hosts services that provide recommendation systems with near real-time updates of blocked or harmful talk audio content. We work heavily with GCP products, including Dataflow, Bigtable, and BigQuery. Some notable systems that we own include the recommendations filtering service (rf-v2) and the Recommendation Eligibility for Talk Audio (RETA) dataset.
We are a diverse team filled with people who are passionate about algorithmic responsibility and the safety of Spotify’s recommendation systems, who also love to have fun. Come join the tribe!
What You'll Do
- Build large-scale batch data pipelines with frameworks such as Scio, Storm, or Spark, and the Google Cloud Platform.
- Deliver scalable, testable, maintainable, and high-quality code.
- Demonstrate standard methodologies in continuous integration and delivery.
- Help drive optimization, testing, and tooling to improve data quality.
- Work within a multi-functional agile team to continuously experiment, iterate, and deliver on new product objectives.
- Be a technical leader on the ProgRes team and within Spotify in general.
- Facilitate and drive collaboration with engineers, product managers, and partners to solve exciting data problems critical to the safety of our listeners.
- Share knowledge, promote best practices, and generally make your team the best version of itself through mentorship and constructive accountability.
Who You Are
- You know how to work with high-volume heterogeneous data, preferably with distributed systems such as Hadoop, BigTable, and Cassandra. Ideally you have also built innovative solutions that address current limitations of these technologies.
- You have used one or more high-level JVM-based processing frameworks such as Beam, Crunch, Scalding, Storm, Spark, or another SQL-like abstraction.
- You have a deep understanding of data modeling, access, and storage, as well as caching, replication, and optimization techniques.
- You have 5+ years of experience in designing and building distributed, high-volume services in Java.
- You have worked in a cloud-native (GCP preferred) development and production environment where all CI/CD takes place.
- You care about agile software processes, data development, reliability, and responsible experimentation.
- You understand the value of collaboration within teams.
- You are comfortable with communication, being able to work independently while always sharing context with your team members.
Where You'll Be
- We are a distributed workforce enabling our band members to find a work mode that is best for them!
- Where in the world? For this role, you can be based anywhere in the Americas or European regions where we have a work location, as long as you are available within our working hours.
- Working hours? We operate between the Eastern and Central European time zones for collaboration.
- Prefer an office to work from home instead? Not a problem! We have plenty of options for your working preferences. Find more information about our Work From Anywhere options here.
Spotify transformed music listening forever when we launched in 2008. Our mission is to unlock the potential of human creativity by giving a million creative artists the opportunity to live off their art and billions of fans the chance to enjoy and be passionate about these creators. Everything we do is driven by our love for music and podcasting. Today, we are the world’s most popular audio streaming subscription service.
Global COVID and Vaccination Disclosure
Spotify is committed to the safety and well-being of our employees, vendors, and clients. We are following regional guidelines mandating vaccination and testing requirements, including those requiring vaccinations and testing for in-person roles and event attendance. For the US, we have mandated that all employees and contractors be fully vaccinated in order to work in our offices and externally with any third parties. For all other locations, we strongly encourage our employees to get vaccinated and also follow local COVID and safety protocols.
Tags: Agile BigQuery Bigtable Cassandra CI/CD Dataflow Data pipelines Distributed Systems GCP Google Cloud Hadoop Pipelines Spark SQL Streaming Testing
Perks/benefits: Equity