Director/Senior Manager, Machine Learning Infrastructure & Engineering

San Francisco, CA




At Instacart, we use machine learning and internet-scale data to elevate the customer experience, improve efficiency, and reduce cost in e-commerce, advertising, and fulfillment. For example, we build large, distributed machine learning models for personalization and recommendation. We're looking for experienced systems and infrastructure engineers with a background in building ML platforms and ML pipelines to join our fast-moving team.


  • There is tremendous opportunity in front of us, and joining now gives you a chance to grow your career and interests as we succeed.
  • You will advise leadership on the machine learning infrastructure needed to support the current and future needs of our business.
  • You will recruit and lead a team of software engineers to work on ML platforms and pipelines for an internet-scale company.
  • You will build scalable, easy-to-use machine learning workflows that make launching machine learning solutions more efficient.
  • You will enable machine learning teams to perform scalable training, evaluation, and inference in the cloud and in client-side infrastructure.
  • You will enable software engineers across the company to use machine learning solutions in their work.
  • You will work closely with related teams, including Search, Ads, Personalization & Recommendation, and Catalog. You will have significant ownership of, and direct responsibility for, this work.


  • Background in Computer Science, Math, Statistics, or a related field.
  • Demonstrated leadership experience.
  • 10+ years of industry experience building ML infrastructure and pipelines at scale.
  • Proficient in Python or C++. Experience writing and maintaining high-quality production code.
  • Knowledge of machine learning (particularly hyperparameter tuning and debugging).
  • Experience serving models built with a variety of ML frameworks such as TensorFlow, PyTorch, and scikit-learn.
  • Experience using AWS to build data-intensive infrastructure.
  • Experience building platforms on top of publicly available ML tools such as SageMaker, Kubeflow, MLflow, and Horovod.