MLOps Engineer - MTS / SMTS / LMTS

India - Bengaluru

Salesforce

Deliver the best customer experience with a single CRM tool for sales, customer service, marketing, commerce, and IT. Try it for 30 days now!

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category

Software Engineering

Job Details

About Salesforce

We’re Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. We also empower you to be a Trailblazer: driving your performance and career growth, charting new paths, and improving the state of the world. If you believe in business as the greatest platform for change, and in companies doing well and doing good, you’ve come to the right place.

Einstein products and platform democratize AI and transform the way our Salesforce Ohana builds trusted machine learning and AI products, in days instead of months. The platform augments the Salesforce Platform with the ability to easily create, deploy, and manage Generative AI and Predictive AI applications across all clouds. We achieve this vision by providing unified, configuration-driven, and fully orchestrated machine learning APIs, customer-facing declarative interfaces, and various microservices for the entire machine learning lifecycle, including data, training, predictions/scoring, orchestration, model management, model storage, and experimentation.

We already produce over a billion predictions per day and train thousands of models per day, along with tens of different large language models, serving thousands of customers. We enable customers to use leading large language models (LLMs), both internally and externally developed, in their Salesforce use cases. Combined with the power of Data Cloud, this platform gives customers an unparalleled advantage in quickly integrating AI into their applications and processes.

We are looking for a passionate Machine Learning SRE to help take us to the next level. You will support and collaborate with the MLE and Data Science teams to build a platform that scales to hundreds of thousands of customers and hundreds of billions of predictions per day, and work on bleeding-edge technologies for model training, model inferencing, and Generative AI. In this position, you will play a crucial role in bridging the gap between machine learning development and operational reliability. You will be responsible for ensuring the seamless integration of machine learning models into our production environment while maintaining high availability, scalability, and reliability. This role requires a deep understanding of both machine learning concepts and modern DevOps/SRE practices.
 

Your Impact:

  • Lead the charge on taking our core platform tools to the next level in terms of engineering maturity and architecture.
  • Refine and develop new workflows, tools, and automation.
  • Build tools to monitor machine learning pipelines and services, data pipeline performance, data quality and models in production.
  • Establish best practices with coding standards, workflows, tools, and product automation.
  • Review and maintain existing tool-set and codebase (pipelines, models, algorithms); continue to improve existing tools and build new ones.
  • Scale the operations of the MLE team by building automation and libraries.



The ideal candidate will be:

  • Technical - We are looking for passionate developers and code geeks who analyze business problems and evolve technical solutions in the simplest, most optimal ways. Engineers sometimes wear multiple hats to drive their projects end-to-end, thinking holistically and comparing the available technologies to make the best technical decisions.
  • A Leader - You are a natural leader, who can mentor and coach engineers on the team to be able to handle bigger challenges, find fulfillment in their work, and execute on the product growth goals through collaboration to do the best work of their lives.
  • Experienced - We need you to bring your experience. We want the best people, who spend large portions of their time thinking about how to design large-scale distributed Machine Learning services.
  • Team Player - You will drive collaboration, efficiency, and communication by liaising with your peers, leadership, product and program management, and other teams. You will offer and seek timely help from your peers, communicate risks and mitigation plans to leadership, and work closely with product managers to iteratively build AI Platform services that cater to our users and business use cases.

Responsibilities:

  • Work with SageMaker, TensorFlow, PyTorch, Triton, Spark, or equivalent large-scale distributed Machine Learning technologies on a modern containerized deployment stack using Kubernetes, Spinnaker, and other technologies.
  • Partner with Product Managers, Architects, Machine Learning Engineers and Software Engineers to understand platform requirements, and help translate requirements to working software.
  • Own the ML DevOps for fully orchestrated machine learning APIs for the Einstein Platform.
  • Contribute to the long-range plan, and help drive the efficient platform architectures for machine learning.
  • Participate in the team’s on-call rotation to address complex problems in real-time and keep services operational and highly available.
  • Create and enforce processes that ensure quality of work, and drive engineering excellence.
  • Exhibit a customer-first mentality while making decisions, and be responsible and accountable for the output of the team.
  • Work collaboratively with geographically distributed teams in North America, EMEA, and APAC.

Core Qualifications:
 

A related technical degree required

  • 4+ years of industry experience and a passion for crafting, analyzing and deploying machine learning-based solutions
  • Experience working as part of a team with mature data science products
  • Consistent record in building and establishing comprehensive monitoring, logging, and alerting solutions to proactively identify and address performance bottlenecks and potential issues, ensuring the continuous availability and reliability of our machine learning systems.
  • Experience deploying, monitoring and maintaining data science products in cloud environments such as AWS or Microsoft Azure
  • Good understanding of Machine Learning methods, including ML project lifecycle and associated challenges at each stage of development.
  • Proficiency in writing good-quality, well-documented, tested, and scalable code (Python preferred). Experience with tools like MLflow, Airflow, and Docker and with cloud platforms such as AWS or GCP is ideal.
  • Strong grasp of DevOps best practices, including continuous integration, continuous deployment, and infrastructure automation, supported by practical experience in implementing and managing CI/CD pipelines.
  • Ability to act as a first responder to production incidents, using your troubleshooting skills and expertise to swiftly diagnose and resolve issues, minimizing downtime and mitigating potential impact on our operations.
  • Strong communication skills and ability to interface well with other engineers, data scientists and product managers
  • Passion, curiosity, solutions focus and independence

Preferred Qualifications

  • Experience in DevOps specific to Salesforce Einstein and Data Cloud platform, including deployment and maintenance of Salesforce applications and integration with machine learning models.
  • Thorough understanding of networking concepts and protocols, with the ability to design and troubleshoot complex network architectures.

Accommodations

If you require assistance due to a disability when applying for open positions, please submit a request via this Accommodations Request Form.

Posting Statement

At Salesforce we believe that the business of business is to improve the state of our world. Each of us has a responsibility to drive Equality in our communities and workplaces. We are committed to creating a workforce that reflects society through inclusive programs and initiatives such as equal pay, employee resource groups, inclusive benefits, and more. Learn more about Equality at www.equality.com and explore our company benefits at www.salesforcebenefits.com.

Salesforce is an Equal Employment Opportunity and Affirmative Action Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Salesforce does not accept unsolicited headhunter and agency resumes. Salesforce will not pay any third-party agency or company that does not have a signed agreement with Salesforce.

Salesforce welcomes all.

Tags: Airflow, APIs, Architecture, AWS, Azure, CI/CD, Data quality, DevOps, Docker, Engineering, GCP, Generative AI, Kubernetes, LLMs, Machine Learning, Microservices, MLflow, ML models, MLOps, Model training, Pipelines, Python, PyTorch, SageMaker, Salesforce, Spark, TensorFlow

Perks/benefits: Career development, Startup environment

Region: Asia/Pacific
Country: India