Data Engineer II

Seattle, Washington, USA

Applications have closed

Amazon.com

Free shipping on millions of items. Get the best of Shopping and Entertainment with Prime. Enjoy low prices and great deals on the largest selection of everyday essentials and other products, including fashion, home, beauty, electronics, Alexa...


Cutting-edge big data technology with Spark, MapReduce, EMR, SageMaker, and distributed pipelines? Check. Deep involvement in business strategy decisions? Check. Work across one of the world's largest and most complex data environments? Check.

Are you excited about working with big data resources and technologies? Are you up to the challenge of working with top economics and data science researchers to bring frontier economic theories to production? Are you ready to build data pipelines that inform billion-dollar business decisions across Amazon? If your answer is yes, come join us!

We are a hybrid research and engineering team that brings disruptive econometric models to life in the video, music, and advertising domains. As a Data Engineer, you will collaborate with research scientists, economists, and software engineers across the company to develop, test, and deploy a wide range of econometric and ML models. You will tackle our ever-growing information challenges and provide our analytics and research science teams with the right data pipelines. You’ll build and support data gathering and validation systems, at the forefront of the ML and Big Data revolution, that are used for analytics, machine learning, and econometrics at scale.

You should be experienced in architecting data solutions for enterprise environments, using technologies such as ETL, RDBMS, Redshift, and Spark. You should excel in the design, creation, management, and business use of large (100 TB+) datasets. You should have excellent business and communication skills for working with scientists and project owners to build datasets that answer business questions. But above all, you should be passionate about working with huge datasets to answer hard business questions and drive disruptive change.

In this role, you will work in one of the world's largest and most complex data environments. You will partner with PhD scientists, business intelligence engineers (BIEs), and software engineers to create and provide the analytic technologies that give our customers timely, flexible, and structured access to data and to the products of disruptive ML models.

Responsibilities:
· Create a productionized data platform that serves as an input to machine learning and econometric models.
· Provide technical and thought leadership for Data Engineering and Business Intelligence.
· Establish key relationships that span Amazon business units and Business Intelligence teams.
· Create a Data Governance strategy for reconciling disparate data sources where applicable.
· Implement standardized, automated operational and quality control processes to deliver accurate and timely data and reporting that meet or exceed SLAs.
· Create and execute a vision for a series of analytic dashboards and decision-support tools that track key forecasting metrics on a daily, weekly, and monthly basis.
· Assist business leaders, scientists (economists and ML scientists), business intelligence engineers, and software developers in creating and implementing business requirement documents to drive projects, working backward from customer needs.

Basic Qualifications

· Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.
· 3+ years of experience in Data Engineering or Business Intelligence roles working with ETL, Data Modeling, and Data Architecture.
· ETL design and SQL skills, knowledge of industry best practices, and a deep understanding of how data is extracted, transformed, scrubbed, and loaded in a large Data Warehouse environment.
· Experience with Big Data technologies such as Pig, Hive, or Spark.
· Proficiency in at least one scripting language: Python, Ruby, Linux shell, or similar.

Preferred Qualifications

· Experience setting up end-to-end data pipelines in an enterprise environment.
· Proficiency in optimizing the performance of Spark queries and jobs.
· Experience with Amazon Web Services tools such as Redshift, EMR, and SageMaker, or other similar platforms.
· Experience using Python, R, or Matlab to manipulate data and set up automated processes to meet business requirements.
· Excellent communication skills with both technical and non-technical users.


Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.


Perks/benefits: Flex hours

Region: North America
Country: United States
Category: Engineering Jobs
