Senior Data Engineer

Bengaluru, India

Applications have closed

Airbnb

Airbnb is a mission-driven company dedicated to helping create a world where anyone can belong anywhere. It takes a unified team committed to our core values to achieve this goal. Airbnb's various functions embody the company's innovative spirit and our fast-moving team is committed to leading as a 21st century company.

The Challenge

Airbnb’s mission is to create a world where people can Belong Anywhere. As we grow to achieve that mission, we are rebuilding our Data Engineering practice to enable the company’s success by building a solid data foundation. We are seeking stunning Senior Data Engineers to help us define and realize our vision for trustworthy data across the company. This is a unique opportunity to join Data Engineering early in its lifecycle at a strong, high-potential company.

A Data Engineer is responsible for designing, developing, producing and owning a specific business domain’s core models. These data models are often intended to be used not only by members of that business domain, but also data consumers from across the company. Common uses for this data include metrics generation, analytics, experimentation, reporting, and ML feature generation. 

Like all teams at Airbnb, we value and promote the diversity of our workforce, our guests, our hosts, our marketplace platform, and the world. Simply put, you belong at Airbnb.

We need to ensure every area of the business has trustworthy data to fuel insight and innovation. Understanding the business need, securing the right data sources, designing usable data models, and building robust & dependable data pipelines are essential skills to meet this goal.

At the same time, the technology used to create great data is continually evolving. We are moving to a reality where both batch & stream processing are leveraged to meet the latency requirements for the business. The Data Engineering paved path is still taking shape, and we want to collaboratively develop this to support the entire company. We need senior engineers who are passionate not only about the data, but also about improving the technology we leverage for Data Engineering.

We are looking for talented Senior Data Engineers who are excited about redefining what it means to do Data Engineering. Data Engineering is part of our Engineering org as we believe great Data Engineering depends on solid Software Engineering fundamentals. However, we also recognize that each Data Engineer has a unique blend of skills. Whether your strength is in data modeling or in stream processing, we want to talk to you.

 

What you’ll do

  • Define
    1. Identify and gather the most frequent data consumption use cases for the datasets they are designing. A critical question we expect DEs to weigh in on is whether existing data models can satisfy the need (or at least serve as a foundation) before new data is built.
    2. Understand the impact of each requirement and use that impact to inform prioritization decisions
    3. Define data governance requirements (data access, privacy/PII, retention, etc.)
    4. Define requirements for upstream data producers to satisfy intended data access patterns (including latency and completeness)
  • Design
    1. Guide product design decisions to ensure the needs for data timeliness, accuracy, and completeness are addressed
    2. Design the data set required to support a specific business domain (i.e. the data model for that business domain)
    3. Identify required data to support data model requirements
    4. Work closely with online system engineers to influence the design of online data models (events and production tables) so that they meet the requirements of the offline data built on them
    5. Partner closely with data source owners (i.e. the engineers who own upstream services and APIs) to specify and document the data that must be ingested for the successful delivery of a requirement
    6. Define and own schema of events for ingested data
    7. Define and document data transformations that will need to be made to transform ingested data into data warehouse tables
    8. Validate the data model meets Data Warehouse standards
    9. Validate the data model integrates with adjacent data models
    10. Document tables & columns for data consumers
    11. Optimize data pipeline design to ensure compute and storage costs are efficient over time
  • Build
    1. Implement data pipelines (streaming & batch) to execute the data transformations required to meet design requirements; a minimal PySpark sketch appears after this list
    2. Validate incoming data to identify syntactic & semantic data quality issues
    3. Validate data through analysis and testing to ensure the data produced meets the requirement specifications
    4. Implement sufficient data quality checks (pre-checks, post-checks, anomaly detection) to preserve ongoing data quality
    5. Partner with data consumers to validate resulting tables address the intended business need
    6. Provide and solicit constructive peer review on data artifacts such as pipeline designs, data models, data quality checks, unit tests, and code
  • Maintain
    1. Continually improve, optimize and tune their data pipelines for performance, storage & cost efficiency. Simplify existing systems and minimize complexity. Reduce data technical debt by actively deprecating low-usage, low-quality, and legacy data and pipelines.
    2. Triage and promptly correct data quality bugs. Implement additional data quality checks to ensure issues are detected earlier
    3. Invest in automation where possible to reduce operational burden
  • Foundations, Citizenship & Stewardship
    1. When building tools to improve the general DE workflow, contribute to the company-wide Data Engineering paved path rather than creating local one-off solutions
    2. Contribute to the education of other DEs and data consumers on the data they curate for the business
    3. Be data driven and influence data-driven decisions. Be factual in communication, use data effectively to tell stories, and be critical of decisions not founded on data
    4. Actively participate in recruiting, interviewing, mentoring and coaching. Champion the mission of Data Engineering by representing Airbnb at tech talks, blog posts, conferences, data meetups and communities
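
For illustration only (a sketch, not Airbnb's internal code): a minimal PySpark batch transformation with a post-check, of the kind described under "Build" above, might look like the following. The source/target table names, columns, and thresholds are hypothetical.

    # Illustrative sketch only -- table names, columns, and thresholds are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("bookings_daily_rollup").getOrCreate()

    # Ingest raw event data (batch) and transform it into a business-domain table.
    bookings = spark.table("raw_events.bookings")        # hypothetical source table
    daily = (
        bookings
        .where(F.col("ds") == "2024-01-01")              # partition being processed
        .groupBy("ds", "listing_id")
        .agg(
            F.count("*").alias("num_bookings"),
            F.sum("gross_amount_usd").alias("gross_booking_value_usd"),
        )
    )

    # Post-check: fail the run before publishing if basic quality expectations are violated.
    row_count = daily.count()
    null_keys = daily.where(F.col("listing_id").isNull()).count()
    if row_count == 0 or null_keys > 0:
        raise ValueError(f"Data quality check failed: rows={row_count}, null_keys={null_keys}")

    daily.write.mode("overwrite").insertInto("core_data.bookings_daily")  # hypothetical target table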

 

What you need to succeed

Not every Data Engineer will need all of these skills, but we expect most to be strong in a significant number of them to be successful at Airbnb.

  • 7+ years of experience (required)
  • Data Product Management
    1. Effective at building partnerships with business stakeholders, engineers and product to understand use cases from intended data consumers 
    2. Able to create & maintain documentation to support users in understanding how to use tables/columns
  • Data Architecture & Data Pipeline Implementation
    1. Experience creating and evolving dimensional data models & schema designs to structure data for business-relevant analytics (e.g. familiarity with Kimball's data warehouse lifecycle).
    2. Strong experience using an ETL framework (e.g. Airflow, Flume, Oozie) to build and deploy production-quality ETL pipelines; a minimal Airflow sketch appears after this list.
    3. Experience ingesting and transforming structured and unstructured data from internal and third-party sources into dimensional models.
    4. Experience with dispersal of data to OLTP stores (e.g. MySQL, Cassandra, HBase) and fast analytics solutions (e.g. Druid, Elasticsearch).
  • Data Systems Design
    1. Strong understanding of distributed storage and compute (S3, Hive, Spark)
    2. Knowledge of distributed system design, such as how MapReduce and distributed data processing work at scale
    3. Basic understanding of OLTP systems like Cassandra, HBase, Mussel, Vitess etc.
  • Coding
    1. Experience building batch data pipelines in Spark
    2. Expertise in SQL
    3. General Software Engineering (e.g. proficiency coding in Python, Java, Scala)
    4. Experience writing data quality unit and functional tests.
  • Aptitude to learn and utilize data analytics tools to accelerate business needs 
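
Similarly, as a hedged illustration of the ETL framework experience mentioned above, a minimal Airflow DAG that orchestrates a batch job and a follow-up quality check might look like the sketch below; the DAG id, schedule, and commands are hypothetical, not Airbnb's actual setup.

    # Illustrative Airflow sketch only -- DAG id, schedule, and commands are hypothetical.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    default_args = {
        "owner": "data-engineering",
        "retries": 2,
        "retry_delay": timedelta(minutes=10),
    }

    with DAG(
        dag_id="bookings_daily_rollup",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        default_args=default_args,
        catchup=False,
    ) as dag:
        # Run the batch transformation (e.g. the PySpark job sketched earlier).
        transform = BashOperator(
            task_id="transform_bookings",
            bash_command="spark-submit jobs/bookings_daily_rollup.py --ds {{ ds }}",
        )

        # Validate the published table before downstream consumers read it.
        quality_check = BashOperator(
            task_id="post_check_bookings",
            bash_command="python checks/bookings_daily_checks.py --ds {{ ds }}",
        )

        transform >> quality_check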

 

Tags: Airflow APIs Cassandra Data Analytics Data pipelines Elasticsearch Engineering ETL HBase Machine Learning MySQL Oozie Pipelines Python Scala Spark SQL Streaming Testing Unstructured data

Perks/benefits: Career development Conferences Team events

Region: Asia/Pacific
Country: India
Category: Engineering Jobs
