Staff AI Data Engineer
LAKE FOREST, IL, US, 60045-5202
Grainger
As a leading industrial distributor with operations primarily in North America, Japan and the United Kingdom, We Keep The World Working® by serving more than 4.5 million customers worldwide with products delivered through innovative technology and deep customer relationships. With 2023 sales of $16.5 billion, we’re dedicated to providing value for customers, fostering an engaging culture for team members and driving strong financial results.
Our welcoming workplace enables you to learn, grow and make a difference by keeping businesses running and their people safe. As a 2024 Glassdoor Best Place to Work and a Great Place to Work-Certified™ company, we’re looking for passionate people to join our team as we continue leading the industry over our next 100 years.
Position Details:
This position at Grainger is focused on transforming both traditional relational data and more complex multimodal sources, such as voice, PDF, or images, for downstream systems that address important business needs. Your primary focus will be building reliable, scalable, and efficient pipelines for use in LLMs. You will play an important part in defining the strategy of the team, evaluating and integrating data patterns and technologies, and building pipelines alongside domain experts and data scientists. You are a thoughtful observer who enjoys investigating business problems and building data solutions that address them.
You are a technical teacher who can guide teams to adopt the capabilities and products you build.
You Will:
- As a technical lead, design and implement highly efficient, reusable, and scalable data processing systems and pipelines that consume both relational and complex multimodal data.
- Design and implement technical solutions and processes to ensure data reliability and accuracy.
- Build pipelines that feed embedding models and vector databases while working with platform-oriented teams to ensure that database response times meet expectations.
- Develop data models and mappings and build new data assets required by data science teams. Perform exploratory data analysis on existing products and datasets.
- Educate other data engineers in adopting new patterns and tools.
- Understand trends and emerging technologies and evaluate the performance and applicability of potential tools for our requirements.
- Work within an Agile delivery / DevOps methodology to deliver product increments in iterative sprints.
- Serve as the subject matter expert (SME) in this area when engaging with our AI, Platform, and Business Analytics teams to build useful pipelines that address ML/AI needs.
- Work with product and business teams to define the roadmap, communication, and architecture.
- Mentor junior team members.
You Have:
- 8+ years of experience in batch and streaming ETL using Spark, Python, Scala, Snowflake or Databricks for Data Engineering or Machine Learning workloads.
- 5+ years of experience orchestrating and implementing pipelines with workflow tools like Databricks Workflows, Apache Airflow, or Luigi.
- 3+ years of experience prepping structured and unstructured data for data science models.
- 3+ years of experience with containerization and orchestration technologies (Docker, Kubernetes); experience with shell scripting in Bash, Unix, or Windows shell is preferred.
- Experience working with vector databases like Milvus, Pinecone, or Weaviate.
- Experience using machine learning in data pipelines to discover, classify, and clean data.
- Experience implementing CI/CD with automated testing in Jenkins, GitHub Actions, or GitLab CI/CD.
- Familiarity with AWS services including, but not limited to, Glue, Athena, Lambda, S3, and DynamoDB.
- Demonstrated experience implementing the data management life cycle, using data quality functions like standardization, transformation, rationalization, linking, and matching.
Rewards and Benefits:
With benefits starting day one, our programs provide choice and flexibility to meet team members' individual needs. Check out the highlights below and review all our benefits at GraingerTotalRewards.com.
- Medical, dental, vision, life, and pet insurance plans and 6 free sessions each year with a licensed therapist to support your emotional wellbeing
- Paid time off (PTO) and 6 company holidays per year
- 6% company contribution to a 401(k) Retirement Savings Plan each pay period, no match required
- Employee discounts, tuition reimbursement, student loan refinancing and free access to financial counseling, education and tools
- Maternity support programs, nursing benefits, and up to 14 weeks paid leave for birth parents and up to 4 weeks paid leave for non-birth parents
We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal opportunity workplace.
We are committed to fostering an inclusive, accessible environment, which includes providing reasonable accommodations to individuals with disabilities both during the application and hiring process and throughout the course of one’s employment. With this in mind, should you need a reasonable accommodation during the application and selection process, please advise us so that we can provide appropriate assistance.