Tech Lead, Data Science
The ML Engineering team within the Marketing Data Science team at Wayfair develops scalable data processing platforms and deploys hundreds of machine learning models that power algorithmic decision-making across dozens of marketing channels and customer touchpoints.
Data is at the heart of everything we do, and ML data engineering is crucial to our ability to scale as we train and deploy the next generation of ML products at Wayfair that power the way millions of customers interact with us. You will be processing petabytes of structured and unstructured first-party and third-party data and building scalable modeling pipelines that help us evaluate the long-term value of the millions of products we offer and of actions the business can take. The tools we’ve built thus far and the new capabilities we have on our roadmap aim to redefine how we think about label and feature generation, model development, deployment, and monitoring.
Above all, you’ll get to work on problems that are both intellectually challenging and drive real, measurable impact, first and foremost for our customers and, as a result, for Wayfair at large. To get a better sense of the type of projects we actually work on, check out our Data Science & Machine Learning blog posts here!
What You'll Do
- Build highly scalable distributed data processing platforms that evaluate the long-term incremental value of our millions of offerings and of actions that the business can take.
- Collaborate with other data scientists to build high quality ML models that can robustly scale up to large volumes in production.
- Partner closely with various business & engineering teams to drive the integration of our model outputs & algorithmic decision-making systems into existing production systems.
- Build tools, software, and microservices that enhance or streamline various steps or challenges within the data science workflow & our tech stack.
- Extend existing ML libraries and frameworks for scalable model training & deployment.
- Be obsessed with the customer and maintain a customer-centric lens in how we frame, approach, and ultimately solve every problem we work on.
- Build highly scalable distributed data processing platforms that evaluate the long-term incremental value of various customer actions (such as downloading our app or signing up for our credit card).
- Build low-dimensional representations of customers to enable better personalization & frictionless product discovery for any given customer. This problem is both highly impactful and intellectually challenging given we offer millions of stylistically unique products on our platform!
What You'll Need
- 3+ years of experience working as a professional software developer, ML Engineer, or Data Scientist (with strong engineering skills & an interest in software development)
- BSc, MS, or PhD in a quantitative field (e.g. mathematics, computer science, engineering, operations research, physics, economics, neuroscience)
- Proficient in object-oriented programming; experience in Python, Java, or similar, and exposure to some of Python’s ML ecosystem (NumPy, pandas, scikit-learn, TensorFlow, etc.)
- Proficient in large-scale data processing and distributed systems (HDFS, Spark, etc.)
- Experience with CI/CD tools (e.g. Jenkins or equivalent), version control (Git), and orchestration/DAGs (Airflow, Luigi, Kubeflow, or equivalent)
- Exposure to the machine-learning model lifecycle: training, evaluation, serving
- Interest in deploying machine-learning models as scalable services
- Desire to work in a collaborative environment focusing on continuous learning; writing blog posts, participating in tech talks, conducting code reviews, etc.
- You don’t have to be an expert in all or any of the above areas, but we need someone with a passion for learning and growing as a software developer and ML engineer
It's a bonus to have:
- Interest in causal inference techniques
- Experience with any of: R, Docker, GCP, Kubernetes, Snowflake