Sr. Data Engineer, Core AI
Seattle, Washington, USA
Amazon.com
Over the past 20 years, Amazon has reinvented on behalf of customers and has become the largest internet retailer in the world. It's still Day 1 for Amazon, and we are continuing our journey to innovate and delight customers with our services and products. The CoreAI team's mission is to leverage the latest science methodologies and invent new science solutions to solve some of the most ambiguous business problems at Amazon. We partner with leaders across Amazon’s teams to solve difficult and challenging problems in multiple domains such as retail, supply chain, search, pricing, and cloud computing. We collaboratively design science and engineering solutions that directly translate into software or next-generation products.
Our data engineering team enables the transformation of structured and unstructured data from 100+ upstream systems into our central multi-petabyte data lake with curated scientific metrics. We leverage advanced big data technologies and distributed computing for our ETL and ELT workflows, running on highly scalable data infrastructure.
As part of the team, you will get the opportunity to:
- Advance our big data stack and cutting-edge technologies, or create new, better, smarter data solutions.
- Work with a team of the best scholars, economists, and machine learning scientists at Amazon.
- Collaborate with software engineers, scientists, and product leaders from multiple domains such as ML, robotics, computer vision (CV), natural language processing (NLP), distributed systems, and econometrics.
- Collaborate on state-of-the-art science model development and production integrations.
Key job responsibilities
- Design, implement, and enhance an analytical data infrastructure providing ad hoc access to large datasets and computing power
- Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using Spark, SQL and AWS big data technologies.
- Interact with scientists and engineers to gather requirements and structure solutions. Drive adoption of new analytic technologies and solutions and promote industry standard best practices.
- Develop automation for analytics applications using modern scripting languages (Python, R, Scala, etc.) for ML/science models.
- Implement mechanisms for data governance, privacy, and protection as per defined data security policies
- Help continually improve ongoing scientific data access and exploratory data analysis (EDA) processes, simplifying self-service support for users.
- Provide technical leadership, lead data engineering initiatives and build end-to-end data solutions that are highly available, scalable, stable, secure, and cost-effective.
- Mentor junior engineers on the team, demonstrate strong written and verbal communication skills, and show curiosity and the ability to learn new concepts, frameworks, and technologies rapidly as changes arise
Basic Qualifications
- Bachelor’s Degree in Computer Science, Information Systems Management, mathematics, or other related fields
- 5+ years of relevant experience in one of the following areas: Data engineering, database engineering, business intelligence or business analytics.
- 5+ years of data warehouse/data lake experience with Oracle, Redshift, PostgreSQL, Spark, etc.
- Demonstrated strength and experience in SQL, Python/PySpark scripting, data modeling, ETL development, and data warehousing
- 5+ years of architectural design or system design experience
- 5+ years of hands-on experience in writing complex, highly-optimized SQL queries across large data sets.
- 5+ years of experience in scripting languages like Python, Scala, etc.
- Experience with big data technologies (Hadoop, Hive, HBase, Pig, Spark, etc.)
- Hands-on experience in building big data solutions using EMR/Elasticsearch/S3/AWS Glue/Lambda/Redshift or an equivalent MPP database
Preferred Qualifications
- Master’s degree in Computer Science, Computer Engineering or related technical discipline
- Experience providing technical leadership and mentoring other engineers for best practices on data engineering.
- Strong problem-solving skills and ability to prioritize conflicting requirements.
- Knowledge of software engineering best practices across the development lifecycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations.
- Knowledge of one or more of the following areas: econometrics, statistical modeling, machine learning, data mining.
- Knowledge of data lake platforms like Databricks, Azure, and AWS, with experience in distributed compute frameworks.
- Experience using business intelligence reporting tools (QuickSight, Tableau, etc.)
- Experience with hardware provisioning, forecasting hardware usage, and managing to a budget
- Broad ability to take a project from scoping requirements through launch and operations of the project
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.