Senior Data Engineer (Python, Spark, GCP) - REMOTE

Remote - United States

Applications have closed

Wavicle Data Solutions

Top-rated data analytics firm offering data management consulting, data integration services, and more.


Wavicle Data Solutions leverages cloud, data, and analytics technologies to deliver complex business and digital transformation solutions to our clients. As a Minority Business Enterprise (MBE) with a workforce that is more than 40% women, Wavicle fosters a diverse and equitable environment where innovative professionals come together as a team to help our clients realize their transformation goals. Our team members collaborate, bringing creative problem-solving skills, agile ways of working, and technical know-how to drive value for our clients.

At Wavicle, a Top Workplace award winner, you’ll find a challenging and rewarding work environment where our 350+ team members, based in the US, India, and Canada, work from 42 cities in a remote/hybrid, digitally connected way. We offer a competitive benefits package that includes healthcare, retirement, life insurance, short- and long-term disability, unlimited paid time off, short-term incentive plans (annual bonus), and long-term incentive plans.


We are looking for a Senior Data Engineer who will be responsible for designing and building optimized data pipelines, in on-prem or cloud environments, to drive analytic insights.

What You Will Get To Do:

  • Create conceptual, logical, and physical data models.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of sources, using technologies such as Hadoop, Spark, and Cloud Functions.
  • Lead and/or mentor a small team of data engineers.
  • Design, develop, test, deploy, maintain, and improve data integration pipelines.
  • Develop pipeline objects using Apache Spark with PySpark (Python) or Scala (a minimal batch sketch follows this list).
  • Design and develop data pipeline architectures using Hadoop, Spark, and related GCP services.
  • Load-test and performance-test data pipelines built with the above technologies.
  • Communicate effectively with client leadership and business stakeholders.
  • Participate in proposal and/or SOW development.
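
The responsibilities above center on the extract-transform-load pattern. As an illustration only, here is a minimal PySpark batch sketch of that pattern; the bucket paths, column names, and job name are hypothetical rather than taken from this posting, and a real pipeline on GCP would typically run on a service such as Dataproc:

    # Minimal batch ETL sketch: read raw JSON events from Cloud Storage,
    # clean them, and write partitioned Parquet to a curated zone.
    # All paths and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily-events-etl").getOrCreate()

    # Extract: raw events landed by an upstream process.
    raw = spark.read.json("gs://example-raw-zone/events/dt=2024-01-01/")

    # Transform: parse timestamps, derive a partition column, drop duplicates.
    events = (
        raw.withColumn("event_ts", F.to_timestamp("event_time"))
           .withColumn("event_date", F.to_date("event_ts"))
           .dropDuplicates(["event_id"])
    )

    # Load: partitioned Parquet for downstream analytics consumers.
    (events.write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("gs://example-curated-zone/events/"))

    spark.stop()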

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or a related field is required.
  • 5+ years of professional experience designing and implementing data pipelines in on-prem and cloud environments is required.
  • 3+ years of experience with cloud platforms (GCP preferred) and with Python programming and frameworks (e.g., Django, Flask, Bottle) is required.
  • 5+ years of experience working with one or more databases such as Snowflake, AWS Redshift, Oracle, SQL Server, Teradata, Netezza, Hadoop, MongoDB, or Cassandra is required.
  • Expert-level knowledge of SQL, including the ability to write complex, highly optimized queries across large volumes of data, is required.
  • 3+ years of hands-on programming experience using Scala, Python, R, or Java is required.
  • 2+ years of professional experience implementing ETL pipelines using GCP services such as Cloud Storage, Cloud Storage for Firebase, Firebase Cloud Functions, Dataproc, Firebase Hosting, Firebase Realtime Database, Firebase Cloud Messaging, Pub/Sub, etc. is required.
  • 2+ years of professional experience with real-time streaming systems (Kafka/Kafka Connect, Spark, Flink, or Pub/Sub) is required (a streaming sketch follows this list).
  • Knowledge of or experience with architectural best practices for building data lakes is required.
  • Strong problem-solving and troubleshooting skills with the ability to exercise mature judgment.
  • Ability to work independently and to provide guidance to junior data engineers.
  • Ability to build and maintain strong customer relationships.
  • Candidates can reside anywhere in the U.S. and work remotely; however, they must be able to travel up to 25% to client locations to build and maintain strong client relationships.
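
For the streaming requirement above, a minimal Spark Structured Streaming sketch reading from Kafka and landing data in Cloud Storage might look like the following; the broker address, topic, and paths are hypothetical, and the job assumes the spark-sql-kafka connector package is available on the cluster:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka-stream-ingest").getOrCreate()

    # Source: subscribe to a Kafka topic (broker and topic are hypothetical).
    stream = (spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker-1:9092")
        .option("subscribe", "events")
        .load())

    # Kafka delivers the payload as bytes; cast it to a string column.
    parsed = stream.select(F.col("value").cast("string").alias("payload"))

    # Sink: append files to Cloud Storage, with a checkpoint so the job
    # can recover its position after a restart.
    query = (parsed.writeStream
        .format("parquet")
        .option("path", "gs://example-stream-sink/events/")
        .option("checkpointLocation", "gs://example-stream-sink/checkpoints/")
        .outputMode("append")
        .start())

    query.awaitTermination()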


Equal Opportunity Employer

Wavicle is an Equal Opportunity Employer and is committed to creating an inclusive environment for all employees. We welcome and encourage diversity in the workplace regardless of race, color, religion, national origin, gender, pregnancy, sexual orientation, gender identity, age, physical or mental disability, genetic information, or veteran status.

Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k, IRA)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Unlimited Paid Time Off (Vacation, Sick & Public Holidays)
  • Short Term & Long Term Disability
  • Training & Development
  • Work From Home
  • Bonus Program
