Technical Lead (AWS Glue, PySpark, SQL) RR/261/2024 || 8-12 Years

Bengaluru, India

Job Title: Data Engineer - PySpark, SQL, Athena, and AWS Glue

Job Description:

We are looking for a talented and experienced Data Engineer proficient in PySpark, SQL, Athena, and AWS Glue to join our dynamic team. As a Data Engineer, you will be responsible for designing, developing, and maintaining robust data pipelines and analytics solutions on the AWS cloud platform. You will work closely with cross-functional teams to understand data requirements, implement efficient data processing workflows, and ensure data quality and integrity.

Responsibilities:

  1. Design, develop, and maintain ETL (Extract, Transform, Load) processes using PySpark, SQL, and AWS Glue to ingest, transform, and load large volumes of structured and unstructured data.
  2. Implement scalable and reliable data pipelines to automate data ingestion and processing tasks, ensuring high performance and efficiency.
  3. Collaborate with data analysts and scientists to understand data needs and implement solutions for advanced analytics and reporting.
  4. Optimize data pipelines for performance and cost efficiency, leveraging AWS services such as Athena for serverless querying of data stored in Amazon S3.
  5. Develop and maintain data catalogs and metadata repositories to facilitate data discovery, lineage tracking, and governance.
  6. Monitor and troubleshoot data pipelines to identify and resolve issues related to data quality, reliability, and performance.
  7. Work closely with infrastructure and operations teams to deploy and manage data processing workflows in a cloud environment.
  8. Stay updated with the latest developments in big data technologies, best practices, and AWS services to continuously improve data engineering capabilities.
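The responsibilities above center on ETL pipelines built with PySpark and AWS Glue, including data-quality gates (item 6) and normalizing data for Athena queries over S3 (item 4). As a minimal illustrative sketch only (the record schema, field names, and date format here are hypothetical, and plain dicts stand in for DataFrame rows to keep it self-contained), a record-level cleansing step of the kind such a pipeline applies might look like:

```python
from datetime import datetime
from typing import Optional


def clean_order(record: dict) -> Optional[dict]:
    """Normalize one raw order record; return None to drop bad rows.

    In a Glue/PySpark job, logic like this would typically run inside a
    DataFrame transformation (e.g. SQL expressions or a mapped function)
    rather than on Python dicts -- this is a self-contained sketch of
    the cleansing rules, not a Glue job.
    """
    # Drop records missing a primary key: a basic data-quality gate.
    order_id = record.get("order_id")
    if not order_id:
        return None

    # Normalize the timestamp to ISO-8601 so downstream Athena queries
    # can filter and partition on it consistently.
    try:
        parsed = datetime.strptime(record.get("order_ts", "").strip(),
                                   "%d/%m/%Y %H:%M")
    except ValueError:
        return None

    return {
        "order_id": str(order_id),
        "order_ts": parsed.isoformat(),
        # Coerce the amount to a rounded float for stable aggregation.
        "amount": round(float(record.get("amount", 0.0)), 2),
    }
```

For example, `clean_order({"order_id": "A1", "order_ts": "02/03/2024 10:15", "amount": "19.999"})` yields a normalized row, while a record with no `order_id` is dropped. In a real Glue job the same rules would usually be expressed as Spark column expressions so they execute in a distributed fashion rather than row by row in Python.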

Requirements:

  1. Bachelor's degree in Computer Science, Engineering, or a related field. Advanced degree preferred.
  2. Proven experience in building data pipelines and analytics solutions using PySpark, SQL, Athena, and AWS Glue.
  3. Strong proficiency in SQL and relational database concepts, with hands-on experience in query optimization and performance tuning.
  4. Experience working with large-scale datasets and distributed computing frameworks.
  5. Solid understanding of data warehousing concepts and best practices.
  6. Hands-on experience with AWS services such as Amazon S3, AWS Glue, Athena, AWS Lambda, and AWS IAM.
  7. Strong problem-solving skills and attention to detail.
  8. Excellent communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
  9. AWS certifications such as AWS Certified Data Analytics - Specialty or AWS Certified Big Data - Specialty are a plus.
  10. Prior experience with other big data technologies such as Apache Spark, Hadoop, or Kafka is desirable.
  11. Familiarity with agile development methodologies and DevOps practices is a plus.

If you are passionate about leveraging cutting-edge technologies to unlock the value of data and drive actionable insights, we would love to hear from you! Join us in our mission to revolutionize data engineering and analytics in the cloud.


Tags: Agile Athena AWS AWS Glue Big Data Computer Science Data Analytics Data pipelines Data quality Data Warehousing DevOps Engineering ETL Hadoop Kafka Lambda Pipelines PySpark RDBMS Spark SQL Unstructured data

Region: Asia/Pacific
Country: India
Category: Leadership Jobs
