Principal Machine Learning Infrastructure Engineer

US, CA, San Jose, Rio Robles

Applications have closed

Analog Devices, Inc. (NASDAQ: ADI) is a global semiconductor leader that bridges the physical and digital worlds to enable breakthroughs at the Intelligent Edge. ADI combines analog, digital, and software technologies into solutions that help drive advancements in digitized factories, mobility, and digital healthcare, combat climate change, and reliably connect humans and the world. With revenue of more than $12 billion in FY22 and approximately 25,000 people globally working alongside 125,000 global customers, ADI ensures today’s innovators stay Ahead of What’s Possible.

ADI’s Central AI team develops next-generation AI technology that transforms our understanding of the physical world.  We develop solutions at multiple tech stack layers, from AI-enabled software applications to deeply embedded AI algorithms.  Our mission is to build the Intelligent Edge, where AI transforms how we solve challenging problems by combining deep application knowledge, close customer relationships, extraordinary data, advanced circuits, and breakthrough algorithms.

We're looking for engineers who bring expertise across the AI space, including ML platform design, cloud-hosted AI services, foundational AI models, LLMs, Edge AI, and cutting-edge AI research; our list of breakthrough products and technologies is growing at a rapid pace. Central AI is critical to ADI’s future and presents opportunities to select from a variety of project areas as you and our AI-driven business grow. Finally, we need our team to be versatile, willing to take risks, able to lead projects quickly, and enthusiastic about new technologies and solutions.

Location: San Jose, CA or Boston, MA

Responsibilities 

  • As an ML Infrastructure Engineer, you will help build, deliver, and optimize software systems to enable AI/ML solutions.

  • Design and implement machine learning systems and workflows to support real-time training, testing, and deployment of AI models.

  • Design and implement distributed cloud GPU training approaches for deep learning model training and evaluation.

  • Build end-to-end machine learning pipelines and integrate them into product and business system workflows.

  • Architect and own the build-release continuous integration processes for our deep learning software components, which are built, tested, and released on various DL frameworks (TensorFlow, PyTorch, JAX, etc.).

  • Propose, implement, and deploy efficient and scalable DevOps solutions to allow our fast-growing team to release software more frequently while maintaining high quality and top performance.

  • Automate away recurring tasks (DL algorithm accuracy and performance regression detection, designing and developing new quality control measures, e.g., code analysis) while employing and advancing best practices.

Qualifications

  • 5+ years of experience in software engineering, including experience with distributed systems and real-time streaming.

  • Degree in Computer Science or a related technical field. 

  • Strong system-level programming skills (Python, shell scripting, etc.) and familiarity with Linux system administration.

  • Hands-on experience with infrastructure engineering, modern DevOps processes, CI/CD, and GitHub.

  • Experience with ML frameworks (PyTorch, TensorFlow, etc.) and model distribution frameworks (TorchServe, etc.).

  • Experience with developing, implementing, and optimizing container orchestration systems, such as Kubernetes.

  • Ability to work with and manage cloud data technologies, such as Kafka, Elasticsearch, Terraform, Airflow, or Dagster.

  • Excellent debugging and optimization skills.

  • Experience working on software teams and willingness to work in a fast-paced environment.

For positions requiring access to technical data, Analog Devices, Inc. may have to obtain export licensing approval from the U.S. Department of Commerce - Bureau of Industry and Security and/or the U.S. Department of State - Directorate of Defense Trade Controls. As such, applicants for this position (except US Citizens, US Permanent Residents, and protected individuals as defined by 8 U.S.C. 1324b(a)(3)) may have to go through an export licensing review process.

Analog Devices is an equal opportunity employer. We foster a culture where everyone has an opportunity to succeed regardless of their race, color, religion, age, ancestry, national origin, social or ethnic origin, sex, sexual orientation, gender, gender identity, gender expression, marital status, pregnancy, parental status, disability, medical condition, genetic information, military or veteran status, union membership, political affiliation, or membership in any other legally protected group.

EEO is the Law: Notice of Applicant Rights Under the Law.

Job Req Type: Experienced

Required Travel: Yes, 10% of the time

Shift Type: 1st Shift/Days

The wage range for a new hire into this position is $0 to $0.
  • Actual wage offered may vary depending on geography, experience, education, training, external market data, internal equity, or other bona fide factors.

  • This position qualifies for a discretionary performance-based bonus which is based on personal and company factors.

  • This position includes medical, vision and dental coverage, 401k, paid vacation, holidays, and sick time, and other benefits.

Tags: Airflow CI/CD Computer Science Dagster Deep Learning DevOps Distributed Systems Elasticsearch Engineering GitHub GPU JAX Kafka Kubernetes Linux LLMs Machine Learning ML infrastructure Model training Pipelines Python PyTorch Research Security Shell scripting Streaming TensorFlow Terraform Testing

Perks/benefits: Career development Flex vacation Health care Salary bonus

Region: North America
Country: United States