Site Reliability Engineer - Observability and Monitoring
Hartford-Farmington Ave Atrium
Full Time Senior-level / Expert USD 124K - 247K
CVS Health
America's leading health solutions company, CVS Health® provides advanced health care from pharmacy services and health plans to health and wellness.
Bring your heart to CVS Health. Every one of us at CVS Health shares a single, clear purpose: Bringing our heart to every moment of your health. This purpose guides our commitment to deliver enhanced human-centric health care for a rapidly changing world. Anchored in our brand, with heart at its center, our purpose sends a personal message that how we deliver our services is just as important as what we deliver.
Our Heart At Work Behaviors™ support this purpose. We want everyone who works at CVS Health to feel empowered by the role they play in transforming our culture and accelerating our ability to innovate and deliver solutions to make health care more personal, convenient and affordable.
Position Summary:
Join Fortune 7 CVS Health as a Staff Software Development Engineer for SRE, Observability & Monitoring to lead our organization's efforts to develop and drive the strategic vision for establishing the SRE function in the Aetna Technology division. You will play a critical role in shaping the direction of SRE practices within the company, working collaboratively with cross-functional teams to drive transformative outcomes, establish best practices, and build a center of excellence. As needed, you will manage both full-time employees and contracted resources to help execute IT project work within the organization at CVS Health. You will ensure that project initiatives and guidelines meet both internal and client requirements and are delivered on time and on budget. In this role, you will lead a cross-functional development team through the various phases of the SDLC for several projects that range in size and complexity. You will also be responsible for engaging with various software vendors that support a series of applications that are critical to the overall Specialty Application. This role is also responsible for the stability of the applications, in support of the corporate goals for system uptime.
Ideally, we are looking for someone to work in a hybrid model from the CT office. Remote candidates will also be considered.
In this position you will:
Collaborate with cross-functional teams, including engineering and infrastructure teams, to align SRE goals with business objectives.
Drive IT systems reliability for customer experience, and implement strategic initiatives such as AIOps, SRE, Automation, etc., to improve systems availability.
Lead the Enablement, Adoption and SRE Center of Excellence (COE) efforts and serve as a technical authority in SRE, including but not limited to system architecture, reliability, scalability, and performance optimization.
Lead the efforts to establish and maintain SRE best practices, Observability architecture, incident management, monitoring, alerting, and automation.
Promote a culture of automation within the SRE team and drive the implementation of automation tools and processes to streamline operations and improve system reliability.
Work closely with infrastructure teams to plan and implement capacity upgrades and optimizations, and with security teams to implement robust SRE security practices.
Monitor and analyze system performance, identifying bottlenecks and areas for improvement. Implement optimizations to enhance system performance and scalability.
Maintain comprehensive documentation of systems, procedures, and best practices. Foster a culture of knowledge sharing and mentorship within the SRE team.
Ensure that SRE practices adhere to industry standards and regulatory requirements.
Required Qualifications:
7+ years of experience with designing distributed applications using Microservices (Java or Python), Containerization technologies (Docker, Kubernetes) and private and/or public cloud such as Azure (preferred) or GCP.
5+ years of experience in a senior SRE/Observability role and well versed in SRE principles, SRE Metrics, Observability, Monitoring and Automation.
5+ years of experience with OpenTelemetry (OTEL) standards and at least one of the Observability & Monitoring LGTM platforms such as Loki, Grafana, Tempo, Managed Prometheus.
Preferred Qualifications:
Experience in implementing and improving SRE metrics in complex and distributed environments.
Experience in setting up LGTM tools such as Loki, Grafana, Tempo & Prometheus on Azure cloud.
Experience with APM and Logging tools such as AppDynamics & Splunk.
Experience with Anomaly detection in Observability Engineering leveraging AIOps & Machine Learning technologies.
Experience with DORA metrics and automating deployments across the entire CI/CD pipeline using DevSecOps tools.
Knowledge of IT Security and compliance, operations and network services, and application development.
Strong communication and collaboration skills, with the ability to work effectively with onshore/offshore teams.
Education:
Bachelor’s degree or equivalent experience (HS diploma + 4 years relevant experience)
Pay Range
The typical pay range for this role is:
$124,372.50 - $247,200.00
This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company’s equity award program.
In addition to your compensation, enjoy the rewards of an organization that puts our heart into caring for our colleagues and our communities. The Company offers a full range of medical, dental, and vision benefits. Eligible employees may enroll in the Company’s 401(k) retirement savings plan, and an Employee Stock Purchase Plan is also available for eligible employees. The Company provides a fully paid term life insurance plan to eligible employees, as well as short-term and long-term disability benefits. CVS Health also offers numerous well-being programs, education assistance, free development courses, a CVS store discount, and discount programs with participating partners. As for time off, Company employees enjoy Paid Time Off (“PTO”) or vacation pay, as well as paid holidays throughout the calendar year. The number of paid holidays, sick time and other time off are provided consistent with relevant state law and Company policies.
For more detailed information on available benefits, please visit jobs.CVSHealth.com/benefits
Tags: Architecture Azure CI/CD CX Docker Engineering GCP Grafana Java Kubernetes Machine Learning Microservices Python SDLC Security Splunk
Perks/benefits: Career development Equity / stock options Health care Insurance Salary bonus