Principal Software Engineer, Data Engineering

San Mateo, CA, United States

Applications have closed

Roblox

Roblox is the ultimate virtual universe that lets you create, share experiences with friends, and be anything you can imagine. Join millions of people and discover an infinite variety of immersive experiences created by a global community!

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators.

At Roblox, we’re building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We’re on a mission to connect a billion people with optimism and civility, and we’re looking for amazing talent to help us get there.

A career at Roblox means you’ll be working to shape the future of human interaction, solving unique technical challenges at scale, and helping to create safer, more civil shared experiences for everyone.

At Roblox, deeply understanding and measuring the experience of users and creators is critical to our rapid growth. The Analytical Data Engineering (ADE) team enables Roblox's success by developing and maintaining the Core Data model, with an eye toward scalability to support the analytical community, and by building tooling that increases the speed at which we build data. As one of the founding members of the ADE team, you will define the data ontology for all of Roblox, define best practices and standards for the analytical community, define Roblox's technical ETL strategy, including batch vs. streaming architecture, and influence event instrumentation.

As an Analytical Data Engineer, you are familiar with supporting Data Science and Machine Learning workflows and should leverage that knowledge to inform your design decisions and implementations. Our team's product will be the interface between data engineering and all other teams that leverage the data to improve the Roblox platform and the experience of our users and creators alike.

You Will:

  • Partner with science, product, and engineering to collect data requirements to define the Core Data Ontology for all of Roblox
  • Lead a growing team of Analytical Data Engineers to support Roblox's ever-evolving data needs
  • Design an extensible and scalable data model to support the ever-growing analytical community
  • Design, build, and maintain efficient and reliable data pipelines in batch and streaming to fuel the core data sets
  • Apply ETL frameworks, and scale and extend their functionality
  • Analyze the use cases for the data to determine appropriate SLAs
  • Analyze the incoming data and upstream pipelines to determine and minimize epistemological issues.
  • Determine appropriate relaxations to deterministic compute and leverage probabilistic data structures (Bloom filters, count-min sketch)
  • Partner with the Data Platform Team to provide approximation algorithms (approximate nearest neighbor, etc.) for high-use statistics of interest
  • Determine caching strategies and eviction policies to support cost-effective analysis
  • Drive adoption of the Core Data tables and publicize new incoming datasets to ensure consistency across the organization

You Have:

  • 8+ years of professional experience working with scalable ETL pipelines on industry-standard ETL orchestration tools (e.g. Airflow, Luigi, Prefect, Dagster, digdag.io, Google Cloud Composer, AWS Step Functions, Azure Data Factory, UC4, Control-M)
  • 3+ years working in the Hadoop Data Ecosystem for data processing
  • 2+ years leading data engineering development directly with business or data science stakeholders.
  • Built, scaled, and maintained multi-terabyte data sets
  • Experience with at least one major cloud's suite of offerings (AWS, GCP, Azure).
  • Developed with Data Quality at the core of your pipelines (e.g. Great Expectations, Datafold, etc.)
  • Developed or enhanced ETL orchestration tools
  • Familiarity with Data Discovery tooling (e.g. Amundsen, Atlas)
  • Worked within a standard GitOps workflow (branch and merge, PRs, CI/CD systems)
  • Familiarity with infrastructure configuration (IaC [e.g. Terraform], cluster parameter tuning, service parameter tuning)

For roles that are based at our headquarters in San Mateo, CA: The starting base pay for this position is as shown below. The actual base pay is dependent upon a variety of job-related factors such as professional background, training, work experience, location, business needs and market demand. Therefore, in some circumstances, the actual salary could fall outside of this expected range. This pay range is subject to change and may be modified in the future. All full-time employees are also eligible for equity compensation and for benefits.

Annual Salary Range: $283,780 to $331,640 USD

You’ll Love: 

  • Industry-leading compensation package
  • Excellent medical, dental, and vision coverage
  • A rewarding 401k program
  • Flexible vacation policy
  • Roflex - Flexible and supportive work policy 
  • Roblox Admin badge for your avatar
  • At Roblox HQ: 
    • Free catered lunches five times a week and several fully stocked kitchens with unlimited snacks
    • Onsite fitness center and fitness program credit
    • Annual Caltrain Go Pass

Roblox provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.
