Data Engineer

Remote, United States

Applications have closed

Labelbox

Discover how leading teams use Labelbox to build AI applications, train and fine-tune models, and automate tasks with LLMs

Labelbox’s mission is to build the best products for humans to advance artificial intelligence. Real breakthroughs in AI depend on the quality of the training data. Our training data platform enables organizations to improve their machine learning models far more quickly and accurately. We are determined to build software that is more open, easier to use, and singularly focused on getting our customers to performant ML faster.
Current Labelbox customers are transforming industries including insurance, retail, manufacturing/robotics, healthcare, and beyond. Our platform is used by Fortune 500 enterprises including Allstate, Black + Decker, Bayer, and Warner Brothers, and by leading AI-focused companies including FLIR Systems and Caption Health. We are backed by leading investors including SoftBank, Andreessen Horowitz, B Capital, Gradient Ventures (Google's AI-focused fund), Databricks Ventures, Snowpoint Ventures, and Kleiner Perkins.
About the Role
Labelbox is hiring a Data Engineer to build new data pipelines and scale existing ones. As our company grows, this person will build data infrastructure that brings together tech, product, and operational functions and informs strategic decision-making at the executive level. You will be responsible for transforming raw data in the data warehouse into clean, reliable, organized data models that allow our organization to make informed, data-driven decisions. Our tech stack currently consists of BigQuery, DBT, and Looker, along with other tools that replicate all of our data to our data warehouse.

What You'll Do

  • Develop and optimize large-scale batch and real-time data pipelines that ingest structured and unstructured data from a variety of sources using a combination of DBT, Fivetran, and other tools
  • Build, rebuild, and performance-tune data transformation tasks within the central data store
  • Take over and scale our DBT and Looker setup
  • Manage incoming data requests and prioritize the highest value projects in an organized fashion
  • Communicate data-backed findings to a diverse constituency of internal and external stakeholders
  • Help create best practices and standards for data modeling, documentation, and testing
  • Design and implement operationally excellent data interfaces with strong autonomy
  • Rigorously design data warehouse schemas to allow for performant access to digestible datasets
  • Become the analytics infrastructure and tooling expert, supporting business-focused pipelines and data interfaces
  • Own data modeling, data warehouse management, and data orchestration

About You

  • Expert-level SQL skills
  • Experience designing and developing data warehouse and analytics solutions, using techniques such as clustering and partitioning on tables of over 1B rows
  • Understanding of data architecture design, data modeling, and physical database design and tuning
  • Hands-on experience implementing cloud data warehouses using BigQuery, PostgreSQL, and MySQL databases
  • Experience using DBT
  • Knowledge of data visualization tools such as Looker
  • Hands-on coding experience in Python

Technology You’ll Use

  • BigQuery/GCS, MySQL, Postgres
  • DBT
  • Fivetran
  • Looker
  • GitHub
  • Jira

Do great work. From anywhere.
We hire great people regardless of where they live. Work wherever you’d like; reliable internet access is our only requirement. We communicate asynchronously, work autonomously, and take ownership of our work.

Tags: BigQuery Databricks Data pipelines Data visualization FiveTran GitHub Jira Looker Machine Learning ML models MySQL Pipelines PostgreSQL Python Robotics SQL Testing Unstructured data

Perks/benefits: Career development

Regions: Remote/Anywhere North America
Country: United States
Category: Engineering Jobs
