Senior Machine Learning Engineer, Distributed Systems

Buenos Aires

Posted 4 weeks ago

At ASAPP, our mission is to solve complex and challenging problems by building transformative machine learning-powered products. We leverage artificial intelligence to address significant challenges that share three common characteristics: huge economic scale, systemic inefficiencies, and tremendous amounts of data. Although our talented teams that drive our product innovation and development are headquartered in New York City, San Francisco, Mountain View, and Buenos Aires, we are open to candidates from anywhere in Argentina for this role. 
We are looking for a new member to join our Machine Learning team as a Senior Machine Learning Engineer focused on Distributed Systems. You should have the passion to tackle tough problems and bring your expertise to ASAPP to help us solve cutting-edge machine learning and natural language processing (NLP) challenges. As part of our team, you will design, develop, and deploy large-scale machine learning training infrastructure and processes that train and update hundreds or thousands of cutting-edge models simultaneously. You’ll work closely with our researchers, site reliability engineers, data engineers, and fellow machine learning engineers to build systems capable of training machine learning models at the scale and quality required to serve our clients, the world’s largest companies. You’ll face new technical challenges and learn new things about ML and software engineering on a daily basis.

What you'll do

  • Help us turn research into meaningful ML products that help tens or hundreds of millions of users
  • Design, build, and maintain model training systems that leverage and enable cutting-edge developments in training machine learning models at scale
  • Actively work on creating and improving tools to parallelize model training, unifying dataset creation and accuracy measurements across experiments
  • Collaborate with and mentor engineers and researchers to help them maximize the speed and efficiency in model research and training
  • Actively follow advancements in AI and ML, and participate in discussions about them with the ML and research teams

What you'll need

  • Minimum of 5 years of experience working on distributed computing projects
  • Desire to learn new things, work closely with peers from different teams, and help others
  • Production experience with modern cloud computing management (AWS, Kubernetes, Docker, etc.)
  • Experience working with distributed computing technologies (for example: Hadoop, Spark, Airflow, Ray)
  • Experience with at least one major programming language (Python, Go, Java, etc.)

What we'd like to see

  • Production experience with Machine Learning and/or Natural Language Processing
  • Proficiency in deep-learning frameworks like TensorFlow or PyTorch, and/or experimentation and training frameworks such as PyTorch-Lightning
  • Experience in big-data, ETL, or large-scale data science
  • Knowledge of additional languages such as Python, Go, Scala, JavaScript, or TypeScript is not necessary but will help you work in cross-functional teams
  • Be passionate about something we don’t already have expertise in!


Benefits

  • Competitive compensation
  • Stock options
  • OSDE 410 for the family group
  • Wellness perks
  • MacBook equipment
  • 3 weeks vacation
  • Training and development
  • English lessons

ASAPP is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, disability, age, or veteran status. If you have a disability and need assistance with our employment application process, please email us to obtain assistance.