Sr. Staff ML Ops Technical Lead
Atlanta, US
Full Time Senior-level / Expert USD 161K - 197K
Dolby Laboratories
Dolby develops audio, imaging, and voice technologies for film, TV, music, and games. Experience everything with stunning sound and breathtaking picture.
Join the leader in entertainment innovation and help us design the future. At Dolby, science meets art, and high tech means more than computer code. As a member of the Dolby team, you’ll see and hear the results of your work everywhere, from movie theaters to smartphones. We continue to revolutionize how people create, deliver, and enjoy entertainment worldwide. To do that, we need the absolute best talent. We’re big enough to give you all the resources you need, and small enough so you can make a real difference and earn recognition for your work. We offer a collegial culture, challenging projects, and excellent compensation and benefits, not to mention a Flex Work approach that is truly flexible to support where, when, and how you do your best work.
Dolby’s consumer entertainment and cinema businesses are bringing Dolby’s breakthrough technologies, powering the world’s top movies, TV shows, music, games, and live sports to more places around the world across a wider range of consumer experiences and devices.
We are seeking a talented Staff Machine Learning Operations Engineer to join the Consumer Entertainment Group, to help bring the next generation of spectacular audio and video experiences to market. You will partner closely with research and development to establish machine-learning best practices and tools that maximize training and use of resources.
Responsibilities
- Troubleshoot high-performance computing, storage and networks for machine-learning workloads.
- Collaborate with research, development and engineering to establish machine-learning and data management workflows and supporting tools and processes that maximize machine-learning activities and use of resources.
- Improve capabilities of data set exploration, transformation and overall data management of large to very large datasets.
- Partner with research and development to proactively iterate and fine-tune model training for best performance and efficient use of machine-learning resources.
- Collaborate with infrastructure teams, including physical compute, storage and network infrastructure experts, to improve on-premises and cloud infrastructure.
- Improve use of cloud compute and storage for global research teams and manage within budget.
Education and Experience
- BS or MS degree in Computer Science or equivalent experience.
- 6+ years of professional practical hands-on experience in machine learning operations or equivalent.
- Comprehensive knowledge of AWS and infrastructure-as-code techniques.
- Advanced proficiency with Python, Terraform, CloudFormation, Ansible, Git and related tools.
- Experience leading a small team of machine-learning operations engineers with international distribution.
- Positive team leader with strong interpersonal skills to build team cohesion and rapport even from half a world away.
- Proficiency with machine learning and scaling workloads with both cloud and on-premise GPU server environments.
- Experience with managing and coordinating storage of large machine learning data sets.
- Proficiency in Kubernetes cluster design, deployment and management.
- Interest in and understanding of industry trends in machine-learning development techniques, tools and processes.
- Comprehensive knowledge of continuous integration and continuous release processes and tools.
Recommended
- Exceptional understanding and practical experience in software and infrastructure configuration management with high-performance compute and storage and maximizing high-availability.
- Active collaborator to help build positive community with researchers, scientists and engineers around machine-learning operations and resources.
- AWS resource management and provisioning.
- Previous experience in system administration and infrastructure.
- Hands-on experience with:
  - Conda, Python
  - Ray cluster design, setup, provisioning and monitoring for high-availability.
  - MLflow or similar
  - High-performance file systems (Lustre, BeeGFS, Weka, or similar).
The Atlanta Area base salary range for this full-time position is $161,400-$197,200, which can vary if outside this location, plus bonus, benefits, and some roles may also include equity. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, competencies, experience, market demands, internal parity, and relevant education or training. Your recruiter can share more about the specific salary range and perks and benefits for your location during the hiring process.
Dolby will consider qualified applicants with criminal histories in a manner consistent with the requirements of San Francisco Police Code, Article 49, and Administrative Code, Article 12.
Equal Employment Opportunity:
Dolby is proud to be an equal opportunity employer. Our success depends on the combined skills and talents of all our employees. We are committed to making employment decisions without regard to race, religious creed, color, age, sex, sexual orientation, gender identity, national origin, religion, marital status, family status, medical condition, disability, military service, pregnancy, childbirth and related medical conditions or any other classification protected by federal, state, and local laws and ordinances.