Technical Lead (AWS Glue, Pyspark, SQL) RR/261/2024 || 8 - 12 Years
Bengaluru, India
Job Title: Data Engineer - PySpark, SQL, Athena, and AWS Glue
Job Description:
We are looking for a talented and experienced Data Engineer proficient in PySpark, SQL, Athena, and AWS Glue to join our dynamic team. As a Data Engineer, you will be responsible for designing, developing, and maintaining robust data pipelines and analytics solutions on the AWS cloud platform. You will work closely with cross-functional teams to understand data requirements, implement efficient data processing workflows, and ensure data quality and integrity.
Responsibilities:
- Design, develop, and maintain ETL (Extract, Transform, Load) processes using PySpark, SQL, and AWS Glue to ingest, transform, and load large volumes of structured and unstructured data.
- Implement scalable and reliable data pipelines to automate data ingestion and processing tasks, ensuring high performance and efficiency.
- Collaborate with data analysts and scientists to understand data needs and implement solutions for advanced analytics and reporting.
- Optimize data pipelines for performance and cost efficiency, leveraging AWS services such as Athena for serverless querying of data stored in Amazon S3.
- Develop and maintain data catalogs and metadata repositories to facilitate data discovery, lineage tracking, and governance.
- Monitor and troubleshoot data pipelines to identify and resolve issues related to data quality, reliability, and performance.
- Work closely with infrastructure and operations teams to deploy and manage data processing workflows in a cloud environment.
- Stay updated with the latest developments in big data technologies, best practices, and AWS services to continuously improve data engineering capabilities.
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field. Advanced degree preferred.
- Proven experience in building data pipelines and analytics solutions using PySpark, SQL, Athena, and AWS Glue.
- Strong proficiency in SQL and relational database concepts, with hands-on experience in query optimization and performance tuning.
- Experience working with large-scale datasets and distributed computing frameworks.
- Solid understanding of data warehousing concepts and best practices.
- Hands-on experience with AWS services such as Amazon S3, AWS Glue, Athena, AWS Lambda, and AWS IAM.
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
- AWS certifications such as AWS Certified Data Analytics - Specialty (formerly AWS Certified Big Data - Specialty) are a plus.
- Prior experience with big data technologies such as Apache Spark, Hadoop, or Kafka is desirable.
- Familiarity with agile development methodologies and DevOps practices is a plus.
If you are passionate about leveraging cutting-edge technologies to unlock the value of data and drive actionable insights, we would love to hear from you! Join us in our mission to revolutionize data engineering and analytics in the cloud.