Sr. ML Infrastructure Engineer

Remote

Applications have closed

Runway

Runway is an applied AI research company shaping the next era of art, entertainment and human creativity.

At Runway, we believe everyone has a story to tell. Our mission is to make professional video and content creation accessible to all. We are taking recent advancements in computer graphics, the web, and machine learning to push the boundaries of creativity and, in turn, lower the barriers to content creation, unlocking a new wave of storytelling 🚀

Over the last three years, we’ve raised funding from top-tier investors including Coatue, Lux, and Amplify, all with a team small enough to fit at one (growing) table. Our team consists of creative, open-minded, caring, and entrepreneurial individuals from all walks of life. We aspire to build incredible things, which starts with building an incredible team, so we’d love to hear from you! 😄

About the role 🎉

We’re looking for a Senior ML Infrastructure Engineer to help us scale the infrastructure and tooling behind the development, testing, and deployment of our machine-learning-based products. The ideal candidate for this role has experience provisioning large compute clusters for machine learning workflows, has experience helping teams establish best practices for reliability and scalability, and thrives in a fast-paced, high-ownership environment.

A peek at our technical stack 🔍

The rich UI of our video editing and collaboration tools is powered by TypeScript and React/Redux, while the real-time compositing and graphics engine behind our interactive preview runs on WebGL2 and WebAssembly. Our video streaming backend components are written in Python, rely heavily on FFmpeg/libav and HLS for on-the-fly transcoding, use PyTorch and TorchScript for ML inference, and are deployed as containerized services on Kubernetes. Our API endpoints for real-time collaboration and media asset management are written in TypeScript and Node.js and are deployed as serverless functions on AWS Lambda.

What you’ll do 🎨

  • Manage large compute clusters for ML training, inference, and development
  • Create tooling and infrastructure that abstract compute and storage in ML development workflows
  • Build automation and CI/CD pipelines for developing and deploying new machine learning models

What you’ll need 💻

  • 3+ years of experience in a DevOps or Infrastructure Engineer role building machine learning infrastructure and working with large GPU clusters
  • Knowledge of cloud providers such as AWS, GCP, or Azure, infrastructure-as-code frameworks such as Terraform, and observability tools such as Grafana
  • Interest in and experience with supporting engineering teams in creating robust processes for automation, reliability, and instrumentation
  • Strong communication, collaboration, and documentation skills

Working at Runway 🥳

We are a small and growing team of artists, engineers, researchers, and dreamers working together to reimagine creativity, and we’re building a unique team of talented individuals from diverse backgrounds. We believe that this will allow us to continue to up-level each other, our company, and our product. We’re looking for people who will add to our culture, not just fit in.

We’re committed to creating a space where our employees can bring their full selves to work and have equal opportunity to succeed. So regardless of race, gender identity or expression, sexual orientation, religion, origin, ability, age, or veteran status, if joining this mission speaks to you, we encourage you to apply!

 

Keep exploring Runway:

Our Behaviors and Company Mission 

Runway raises $35M Series B

Building Impossible Things: A glimpse into the future of Runway

Runway Research

Runway Graphics

Runway Blog

Tags: APIs AWS Azure CI/CD Content creation DevOps Engineering GCP GPU Grafana Kubernetes Lambda Machine Learning ML infrastructure ML models Node.js Pipelines Python PyTorch React Research Streaming Terraform Testing TypeScript

Region: Remote/Anywhere