Site Reliability Engineer - Big Data (On Premise)
Bengaluru
PhonePe
PhonePe is a Digital Wallet & Online Payment App that allows you to make instant Money Transfers with UPI. Recharge Mobile, DTH, Pay Utility Bills, Buy/Invest in Gold, Mutual Funds, Insurance & much more.
About PhonePe Group:
PhonePe is India’s leading digital payments company with 50 crore (500 Million) registered users and 3.7 crore (37 Million) merchants covering over 99% of the postal codes across India. On the back of its leadership in digital payments, PhonePe has expanded into financial services (Insurance, Mutual Funds, Stock Broking, and Lending) as well as adjacent tech-enabled businesses such as Pincode for hyperlocal shopping and Indus App Store, which is India's first localized App Store. The PhonePe Group is a portfolio of businesses aligned with the company's vision to offer every Indian an equal opportunity to accelerate their progress by unlocking the flow of money and access to services.
Culture
At PhonePe, we take extra care to make sure you give your best at work, every day! Creating the right environment for you is just one of the things we do. We empower people and trust them to do the right thing. Here, you own your work from start to finish, right from day one. Being enthusiastic about tech is a big part of being at PhonePe. If you like building technology that impacts millions, ideating with some of the best minds in the country and executing on your dreams with purpose and speed, join us!
Job Overview:
As a Site Reliability Engineer (SRE) specializing in the on-premises Data Platform, you will play a critical role in deploying our Cloudera Data Platform (CDP) infrastructure and ensuring its reliability, scalability, and performance. You will collaborate closely with cross-functional teams to design, implement, and maintain robust systems that support our data-driven initiatives. The ideal candidate will have a deep understanding of Cloudera Data Platform, strong troubleshooting skills, and a proactive mindset towards automation and optimization. You will be pivotal to the smooth operation, performance, and security of large, high-density Cloudera-based infrastructure.
Key Responsibilities:
- Implementation of Cloudera Data Platform: Lead the implementation process of Cloudera Data Platform on-premises, including planning, installation, configuration, and integration with existing systems.
- Infrastructure Management: Manage and maintain the Cloudera-based infrastructure, ensuring optimal performance, high availability, and scalability. This includes monitoring system health, troubleshooting issues, and performing routine maintenance tasks.
- Data Security and Compliance: Implement and enforce security best practices to safeguard data integrity and confidentiality within the Cloudera environment. Ensure compliance with relevant regulations and standards (e.g., GDPR, HIPAA, DPR).
- Performance Optimization: Continuously optimize the Cloudera infrastructure to enhance performance, efficiency, and cost-effectiveness. Identify and resolve bottlenecks, tune configurations, and implement best practices for resource utilization.
- Capacity Planning: Monitor resource utilization trends and plan for future capacity needs. Proactively identify potential capacity constraints and propose solutions to address them.
- Backup and Disaster Recovery: Implement robust backup and disaster recovery strategies to ensure data protection and business continuity. Test and maintain backup and recovery procedures regularly.
- Patches & Upgrades: Routinely apply recommended patches and perform rolling upgrades of the platform in accordance with the advisory from Cloudera, InfoSec and Compliance.
- Documentation and Knowledge Sharing: Create comprehensive documentation for configurations, processes, and procedures related to the Cloudera Data Platform. Share knowledge and best practices with team members to foster continuous learning and improvement.
- Collaboration and Communication: Collaborate effectively with cross-functional teams including data engineers, developers, and IT operations personnel. Communicate project status, issues, and resolutions clearly and promptly.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or related field.
- Proficiency in Linux system administration, shell scripting, and networking concepts.
- 5+ years of experience in managing Big Data infrastructure.
- Strong understanding of distributed computing principles and experience with Hadoop ecosystem technologies (HDFS, MapReduce, YARN, Hive, Spark, etc.).
- Hands-on experience with configuration management tools (e.g., Salt, Ansible, Puppet, Chef).
- Strong scripting skills (e.g., Python, Bash) for automation and troubleshooting.
- Experience with monitoring and logging solutions (e.g., Prometheus, Grafana, ELK stack).
- Knowledge of networking principles and protocols (TCP/IP, UDP, DNS, DHCP, etc.).
- Experience managing *nix-based machines (e.g., Ubuntu, Fedora, Red Hat) and strong working knowledge of essential Unix programs and tools.
- Excellent communication skills and the ability to collaborate effectively with cross-functional teams.
- Excellent analytical, problem-solving, and troubleshooting skills.
- Proven ability to work well under pressure and manage multiple priorities simultaneously.
Good To Have:
- Cloudera Certified Administrator (CCA) or Cloudera Certified Professional (CCP) certification preferred.
- At least 5 years of experience managing and administering medium-to-large Hadoop-based environments (>100 machines); Cloudera Data Platform (CDP) experience is highly desirable.
- Familiarity with Open Data Lake components such as Ozone, Iceberg, Spark, Flink, etc.
- Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes, OpenShift) is a plus.
PhonePe Full Time Employee Benefits (Not applicable for Intern or Contract Roles)
- Insurance Benefits - Medical Insurance, Critical Illness Insurance, Accidental Insurance, Life Insurance
- Wellness Program - Employee Assistance Program, Onsite Medical Center, Emergency Support System
- Parental Support - Maternity Benefit, Paternity Benefit Program, Adoption Assistance Program, Day-care Support Program
- Mobility Benefits - Relocation benefits, Transfer Support Policy, Travel Policy
- Retirement Benefits - Employee PF Contribution, Flexible PF Contribution, Gratuity, NPS, Leave Encashment
- Other Benefits - Higher Education Assistance, Car Lease, Salary Advance Policy
Working at PhonePe is a rewarding experience! Great people, a work environment that thrives on creativity, and the opportunity to take on roles beyond a defined job description are just some of the reasons you should work with us. Read more about PhonePe on our blog.