Data Engineer, Machine Learning Data and Platform
Pittsburgh, PA; Palo Alto, CA; Austin, TX
Who we are:
Argo AI is in the business of building self-driving technology you can trust. With experienced leaders in the field and collaborative partnerships with some of the world’s largest automakers, we’re building self-driving technology that is engineered to scale globally and transform mobility for millions.
Talented individuals join our team because they share our purpose to make it safe, easy, and enjoyable for everyone to get around cities. We aspire to impact key industries that move people and goods, from ride hailing to deliveries.
Meet the team:
The Machine Learning Infrastructure & Analytics (MLIA) team at Argo is responsible for delivering the platforms, tools, and services that power the ML workflows and Data Analysis needs of the organization. The Labeling Pipeline sub-team of MLIA builds and maintains the data pipelines and software tooling that drive Ground Truth Labeling, ML Training, Data Science, and Data Analytics at Argo. The team is responsible for pipeline health, data quality, lineage, and data lifecycle management. The pipelines, tools, and services provided by Labeling Pipeline are essential to the development of Machine Learning models by our Perception and Autonomy teams. This team is also responsible for data pipeline integration with our Analytics platform, which enables the creation of actionable insights from Argo’s data.
We are hiring an experienced Data Engineer to help deliver the data, tools, and services that form the foundation of our ML and Analytics platforms. This is a high-visibility role that offers the opportunity to learn about, interact with, and add significant value to every part of our company, supporting our goal of building safe and efficient self-driving vehicles.
What you’ll do:
- Work with stakeholders in the Perception, Autonomy, and Operations teams to define data requirements
- Oversee the end-to-end architectural vision and focus on addressing our most critical problems and needs
- Build automated and scalable end-to-end labeling pipelines for our ML model training
- Own initiatives from inception to implementation
- Work heavily with Python, Spark, SQL, Airflow, and Looker
- Work with modern cloud, big data, and deployment technologies (e.g., AWS, Kubernetes, Docker)
- Continually improve the quality, efficiency, and robustness of our labeling pipelines and tooling at Argo
- Follow and promote Labeling Pipeline and Machine Learning best practices across the organization
- Participate in code and architectural reviews
What you'll need to succeed:
- Degree in Computer Engineering, Computer Science, Electrical Engineering, Robotics or a related field
- 3+ years of experience in software development
- Strong team player who can collaborate and communicate effectively within and between teams
- Experience building highly scalable, reliable, and maintainable data pipelines
- Experience delivering software and systems that support Machine Learning, Deep Learning, or both
- Experience working with Job Scheduling tools (Airflow, Luigi, Oozie)
- Experience working with cloud infrastructure platforms (AWS, Azure, or GCP)
- Proficiency developing software with Python (experience with C++ is also highly desirable)
- Experience working with SQL and NoSQL database technologies
- Strong presentation and communication skills
What we offer you:
- High-quality individual and family medical, dental, and vision insurance
- Competitive compensation packages
- Employer-matched 401(k) retirement plan with immediate vesting
- Employer-paid group term life insurance and the option to elect voluntary life insurance
- Paid parental leave
- Paid medical leave
- Unlimited vacation
- Complimentary daily lunches, beverages, and snacks
- Pre-tax commuter benefits
- Monthly wellness stipend
- Professional development reimbursement
- Employee assistance program
- Discounted programs that include legal services, identity theft protection, pet insurance, and more
- Company and team bonding outlets: employee resource groups, quarterly team activity stipend, and wellness initiatives
Our Background:
Argo AI was founded in late 2016 by industry experts with extensive experience building robotic systems for commercial applications. Our once-small team has since grown into a company of more than 1,000 people, with strategic partnerships with two of the world’s leading automakers: Ford and Volkswagen. Our self-driving system is the first with commercial deployment plans for Europe and the U.S., and thanks to an ability to tap into both automakers’ global reach, our technology platform has the largest geographic deployment potential of any self-driving technology to date.
At Argo AI, we believe that embracing differences delivers superior results. We are an equal opportunity employer that is committed to an inclusive environment for all employees.