Data Engineer II

Seattle, Washington, USA

Full Time
Amazon.com

Posted 2 weeks ago

Do you want to build a cutting-edge, highly scalable data infrastructure using AWS technologies and support new machine learning initiatives? Does the prospect of dealing with massive volumes of diverse data excite you? Do you want to work on the performance challenges of providing the best recommendations in less than 140 milliseconds, given millions of customers and millions of products?
As a Data Engineer, you will design and implement end-to-end data infrastructure to support new machine learning experiences and experiments. You should excel in the design, creation, and management of extremely large datasets.

Responsibilities
In this role, you will have the opportunity to display your skills in the following areas:
· Lead or participate in gathering business requirements, analyzing source systems, defining underlying data sources and transformation requirements, designing suitable data models, and developing metadata to support ML use cases
· Design, implement and support an ML data infrastructure providing ad-hoc access to large datasets and computing power
· Manage AWS resources, including EMR, Kinesis, and DynamoDB
· Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using Spark/EMR
· Explore and learn the latest AWS technologies to provide new capabilities and increase efficiency
· Help continually improve ongoing data transformation processes, automating or simplifying self-service support for customers

Basic Qualifications


· A Bachelor’s degree or higher in computer science is required, with at least 5 years of industry experience.
· 4+ years of experience working with large data sets and analyzing data to identify patterns.
· 5+ years of experience on data warehousing projects, including at least 4 years of full life cycle experience implementing and supporting DW solutions.
· Must have extensive knowledge of SQL and ETL best practices.
· Query performance tuning skills
· Experience with Big Data technologies (Hadoop, Hive, HBase, Pig, Spark, etc.)
· Experience with programming languages that support a functional style (Scala, Python, etc.)
· Experience in leveraging distributed architecture when working with large datasets
· Excellent communication skills and the ability to work well in a team.
· Effective analytical, troubleshooting and problem-solving skills.

Preferred Qualifications

· Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
· Hands-on experience with BI platform solutions.
· Experience building real-time data pipelines
· Experience with Agile Development
· Knowledge/Experience working with ML systems
· Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space

Job tags: AWS, Big Data, Data Warehousing, Engineering, ETL, Hadoop, Machine Learning, ML, Python, Scala, Spark, SQL