Data Engineer

San Francisco

Applications have closed

Hive

Hive's APIs enable developers to integrate pre-trained AI models that address technically challenging content understanding needs into their applications.

Posted 3 years ago

About Hive
Hive is a full-stack deep learning platform helping to bring companies into the AI era. We take complex visual challenges and build custom machine learning models to solve them. For AI to work, companies need large volumes of high-quality training data. We generate this data through Hive Data, our proprietary data labeling platform, whose more than 1,000,000 globally distributed workers produce millions of high-quality pieces of data per day. We then use this training data to build machine learning models for verticals such as Media, Autonomous Driving, Security, and Retail. Today, we work with some of the largest companies in the world to redefine how they think about unstructured visual data. Together, we build solutions that incorporate AI into their businesses and transform their industries.
We are fortunate that investors like Peter Thiel (Founders Fund), General Catalyst, 8VC, and others see Hive's potential to be groundbreaking in AI business solutions. We have over 160 talented individuals globally in our San Francisco and Delhi offices. Please reach out if you are interested in joining the AI revolution!
Data Engineer Role
To execute our vision, we need to grow our team of best-in-class data engineers. We are looking for developers who follow impeccable data practices and build high-quality data infrastructure. We value hard workers who are comfortable improvising solutions to big data challenges while building a system that can stand the test of time. Our ideal candidate has experience building data infrastructure from the ground up, contributes innovative ideas and ingenious implementations to the team, and is capable of planning scalable, maintainable data pipelines.
As a data engineer, you would at first work primarily on our Hive Media product, taking real-time data from hundreds of television streams and turning it into a combination of real-time and scheduled outputs, especially our signature ads feed. Your work would improve the quality of our results while reducing computational cost and latency. Expect truly novel challenges.

Responsibilities

Writing scheduled Spark pipelines that perform sophisticated queries on the entirety of our datasets
Writing real-time pipelines that execute complex operations on incoming data
Synchronizing large amounts of data between unstructured and structured formats on various data sources
Creating tests and alerts for data pipelines
Building out our data infrastructure and managing dependencies between data pipelines
Defining and implementing metrics that provide visibility into our data quality

Requirements

You have an undergraduate and/or graduate degree in computer science or a similar technical field, with a sound understanding of statistics
You have 1-2 years of industry experience as a data engineer
You have hands-on experience with ETL and have written data pipelines in Spark, Hadoop, or similar technologies
You have a sound understanding of SQL
You have worked with data lakes such as S3 or HDFS
You have worked with various databases, such as Postgres, Cassandra, or Redshift, and understand their pros and cons
You have a working knowledge of the following technologies, or are not afraid of picking them up on the fly: Mesos, Chronos, Marathon, Jenkins
You are fluent in at least one scripting language (preferably Node.js or Python) and one compiled language (such as Scala, Java, or C)
You have great communication skills and the ability to work well with others
You are a strong team player, with a do-whatever-it-takes attitude

What We Offer You
We are a group of ambitious individuals who are passionate about creating a revolutionary machine learning company. At Hive, you will have a significant career development opportunity and a chance to contribute to one of the fastest-growing AI startups in San Francisco. The work you do here will have a noticeable and direct impact on the development of Hive.
Our benefits include competitive pay, equity, health / vision / dental insurance, catered lunch and dinner, a corporate gym membership, etc.
Thank you for your interest in Hive.

Tags: Autonomous Driving Big Data Cassandra Computer Science Data pipelines Deep Learning ETL Hadoop HDFS Machine Learning ML models Node.js Pipelines PostgreSQL Python Redshift Scala Security Spark SQL Statistics Testing

Perks/benefits: Career development Competitive pay Fitness / gym Flex vacation Health care

Region: North America

Country: United States

Category: Engineering Jobs
