Data Engineer
Remote, United States
Labelbox
Discover how leading teams use Labelbox to build AI applications, train and fine-tune models, and automate tasks with LLMs. Current Labelbox customers are transforming industries within insurance, retail, manufacturing/robotics, healthcare, and beyond. Our platform is used by Fortune 500 enterprises including Allstate, Black + Decker, Bayer, Warner Brothers and leading AI-focused companies including FLIR Systems and Caption Health. We are backed by leading investors including SoftBank, Andreessen Horowitz, B Capital, Gradient Ventures (Google's AI-focused fund), Databricks Ventures, Snowpoint Ventures and Kleiner Perkins.
About the Role
Labelbox is hiring a Data Engineer to build new data pipelines and scale existing ones. As our company grows, this person will build data infrastructure that brings together tech, product, and operational functions and informs strategic decision-making at the executive level. You will be responsible for transforming raw data in the data warehouse into clean, reliable, organized data models that allow our organization to make informed, data-driven decisions. Our tech stack currently consists of BigQuery, DBT, and Looker, along with other tools that replicate all of our data to our data warehouse.
What You'll Do
- Develop and optimize large-scale batch and real-time data pipelines that ingest structured and unstructured data from a variety of sources using a combination of DBT, Fivetran, and other tools
- Build, rebuild, and performance-tune data transformation tasks within the central data store
- Take over and scale our DBT and Looker setup
- Manage incoming data requests and prioritize the highest value projects in an organized fashion
- Communicate data-backed findings to a diverse constituency of internal and external stakeholders
- Help create best practices and standards for data modeling, documentation, and testing
- Design and implement operationally excellent data interfaces with strong autonomy
- Rigorously design data warehouse schemas to allow for performant access to digestible datasets
- Become the analytics infrastructure and tooling expert, supporting business-focused pipelines and data interfaces
- Own data modeling, data warehouse management, and data orchestration
About You
- Expert-level SQL skills
- Experience in a role performing data warehouse and analytics solution design and development using a variety of techniques such as clustering and partitioning on tables over 1B rows
- Understanding of data architecture design, data modeling, and physical database design and tuning
- Hands-on experience in the implementation of cloud data warehouses using BigQuery, Postgres, and MySQL databases
- Experience using DBT
- Knowledge of data visualization tools such as Looker
- Hands-on coding experience in Python
Technology You’ll Use
- BigQuery/GCS, MySQL, Postgres
- DBT
- Fivetran
- Looker
- GitHub
- Jira
We hire great people regardless of where they live. Work wherever you’d like; reliable internet access is our only requirement. We communicate asynchronously, work autonomously, and take ownership of our work.
Tags: BigQuery Databricks Data pipelines Data visualization FiveTran GitHub Jira Looker Machine Learning ML models MySQL Pipelines PostgreSQL Python Robotics SQL Testing Unstructured data
Perks/benefits: Career development