Data Engineer

Irvine, California, USA

Applications have closed

Amazon.com

Job summary
Amazon’s Onsite, Offsite, and Perception Measurement (O2PM) team (a part of Customer Behavior Analytics) is hiring a talented, self-directed Data Engineer to support the rapid growth of our Marketing Measurement solutions. You will design, develop, implement, test, document, and operate large-scale, high-volume, high-performance data structures and pipelines for our internal customers. Implement data structures using best practices in data modeling and ETL/ELT processes. Gather business and functional requirements and translate these requirements into robust, scalable, operable solutions that work well within the overall data architecture. Analyze source data systems and drive best practices in source teams. Participate in the full development life cycle, end-to-end, from design, implementation and testing, to documentation, delivery, support, and maintenance. Produce comprehensive, usable dataset documentation and metadata. Set up and maintain a compliant system of credential authentication. Evaluate and make decisions around dataset implementations and new or existing software products and tools. Educate and mentor scientists in best data querying practices to improve efficiencies and accelerate.

The ideal candidate relishes working with a science team to develop scalable products that ingest large volumes of data to understand customer preferences, enjoys working independently and the challenge of highly complex technical contexts, and, above all else, is passionate about data and analytics. They are expert in data modeling, ETL design, and business intelligence tools, and passionately partner with the business to identify strategic opportunities where improvements in data infrastructure create out-sized business impact. They are self-starters, comfortable with ambiguity, able to think big (while paying careful attention to detail), and enjoy working in a fast-paced team. The ideal candidate needs to possess exceptional technical expertise in large-scale data warehouses, data lakes, and BI systems, with hands-on knowledge of SQL, distributed/MPP data storage, and AWS services (S3, Redshift, EMR, RDS).

Key job responsibilities
  • Design, implement, and support a platform providing ad hoc access to large datasets
  • Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using SQL
  • Implement data structures using best practices in data modeling, ETL/ELT processes, SQL, and Redshift
  • Build robust and scalable data integration (ETL) pipelines using SQL, Python, Spark and Scala
  • Build and deliver high quality datasets to support business analysis and customer reporting needs
  • Interface with internal customers and scientists, gathering requirements and delivering complete data structures

About the team
The Customer Behavior Analytics (CBA) organization owns Amazon’s insights pipeline, from data collection to deep analytics. We aspire to be the place where Amazon teams come for answers, a trusted source for data and insights that empower our systems and business leaders to make better decisions. Our outputs shape Amazon product and marketing teams’ decisions and thus how Amazon customers see, use, and value their experience.

Basic Qualifications


  • Degree in Computer Science, Engineering, Mathematics, or a related field, or 4+ years of industry experience
  • 3+ years of experience as a Data Engineer
  • Experience with data modeling, data warehousing, and building ETL pipelines
  • Experience writing complex, highly-optimized SQL queries across large data sets
  • Experience with AWS technologies such as Redshift, EMR, and S3
  • Coding proficiency in at least one modern programming language (e.g. Python, Spark, Scala)

Preferred Qualifications

  • Experience building/operating large scale pipelines using distributed systems for data extraction, ingestion, and processing of large data sets
  • Experience building data products incrementally and integrating and managing datasets from multiple sources
  • Experience working with a science team and/or familiarity with Machine Learning
  • Familiarity with Spark and/or Scala
  • Experience leading large-scale data warehousing and analytics projects, including using AWS technologies – Redshift, S3, EMR, Glue etc.
  • Experience providing technical leadership and mentoring scientists and other engineers on best practices in the data engineering space


Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.

Perks/benefits: Career development

Region: North America
Country: United States
Category: Engineering Jobs