Aumni - Site Reliability Engineer III - MLOPS
Salt Lake City, UT, United States
JPMorgan Chase & Co.
There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems.
As a Site Reliability Engineer III at JPMorgan Chase within Digital Private Markets / Aumni (a JPMorgan Chase company), you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure, independently decomposing and iteratively improving on existing solutions. You are a significant contributor to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of your application or platform. As an MLOps Engineer, you will apply the same approach to the models produced by our data science teams and their associated infrastructure: configuring, maintaining, monitoring, and optimizing them through code and cloud infrastructure, and sharing your knowledge of end-to-end operations, availability, reliability, and scalability in the AI/ML space.
Job responsibilities
- Guides and assists others in the areas of designing and deploying new AI/ML models in the cloud, gaining consensus from peers where appropriate
- Designs and implements automated continuous integration and continuous delivery pipelines for the Data Science teams to develop and train AI/ML models
- Writes and deploys infrastructure as code for the models and pipelines you support
- Collaborates with technical experts, key stakeholders, and team members to resolve complex technical problems
- Understands the importance of monitoring and observability in the AI/ML space (i.e. service level indicators) and utilizes service level objectives
- Proactively resolves issues before they impact internal and external stakeholders of deployed models
- Supports the adoption of MLOps best practices within your team
Required qualifications, capabilities, and skills
- Formal training or certification on site reliability engineering concepts and 3+ years applied experience
- Understanding of MLOps culture and principles and familiarity with how to implement associated concepts at scale
- Domain knowledge of machine learning applications and technical processes within the AWS ecosystem
- Experience with infrastructure as code tooling such as Terraform or CloudFormation
- Experience with containers and container orchestration tools such as Docker, ECS, and Kubernetes
- Knowledge of continuous integration and continuous delivery tools like Jenkins, GitLab, or GitHub Actions
- Proficiency in the following programming languages: Python, Bash
- Hands-on knowledge of Linux and networking internals
- Understanding of the different roles served by data engineers, data scientists, machine learning engineers, and system architects, and how MLOps contributes to each of these workstreams
- Ability to identify new technologies and relevant solutions to ensure design constraints are met by the Data Science and Machine Learning teams
- Experience with model training and deployment pipelines, and with managing scoring endpoints
- Familiarity with observability concepts and telemetry collection using tools such as Datadog, Grafana, Prometheus, Splunk, and others
- Understanding of data engineering platforms such as Databricks or Snowflake, and machine learning platforms such as AWS SageMaker
- Comfortable troubleshooting common containerization technologies and issues
- Ability to proactively recognize roadblocks and a demonstrated interest in learning technology that facilitates innovation
We offer a competitive total rewards package including base salary determined based on the role, experience, skill set, and location. For those in eligible roles, we offer discretionary incentive compensation which may be awarded in recognition of firm performance and individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.
JPMorgan Chase is an Equal Opportunity Employer, including Disability/Veterans