Principal Software Engineer, Data Engineering
San Mateo, CA, United States
Roblox
Roblox is an immersive platform for communication and connection. Join millions of people and discover an infinite variety of immersive experiences created by a global community. Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators.
At Roblox, we’re building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We’re on a mission to connect a billion people with optimism and civility, and looking for amazing talent to help us get there.
A career at Roblox means you’ll be working to shape the future of human interaction, solving unique technical challenges at scale, and helping to create safer, more civil shared experiences for everyone.
A deep understanding and measurement of users' and creators' experience is critical to Roblox's rapid growth. The Analytical Data Engineering (ADE) team enables Roblox's success by developing and maintaining the Core Data model, with an eye for scalability to support the analytical community, and by building tooling that increases the speed at which we build data. As one of the founding members of the ADE team, you will define the data ontology for all of Roblox, define best practices and standards for the analytical community, define Roblox's technical strategy for ETL, including batch vs. streaming architecture, and influence event instrumentation.
As an Analytical Data Engineer you are familiar with supporting Data Science and Machine Learning workflows, and should leverage that knowledge to inform your design decisions and implementations. Our team's product will be the interface between data engineering and all other teams who will leverage the data to improve the Roblox platform and the experience of our users and creators alike.
You Will:
- Partner with science, product, and engineering to collect data requirements to define the Core Data Ontology for all of Roblox
- Lead a growing team of Analytical Data Engineers to support Roblox's ever-evolving data needs
- Design an extensible and scalable data model to support the ever-growing analytical community
- Design, build, and maintain efficient and reliable data pipelines in batch and streaming to fuel the core data sets
- Apply ETL frameworks, and scale and extend their functionality
- Analyze the use cases for the data to determine appropriate SLAs
- Analyze the incoming data and upstream pipelines to determine and minimize epistemological issues.
- Determine appropriate relaxations to deterministic compute and leverage probabilistic data structures (Bloom filters, count-min sketch)
- Partner with the Data Platform Team to provide approximation algorithms (approximate nearest neighbor, etc.) for high-use statistics of interest
- Determine caching strategies and eviction policies to support cost-effective analysis
- Drive adoption of the Core Data tables and publicize new incoming datasets to ensure consistency across the organization
You Have:
- 8+ years of professional experience working with scalable ETL pipelines on industry-standard ETL orchestration tools (e.g. Airflow, Luigi, Prefect, Dagster, digdag.io, Google Cloud Composer, AWS Step Functions, Azure Data Factory, UC4, Control-M)
- 3+ years working in the Hadoop Data Ecosystem for data processing
- 2+ years leading data engineering development directly with business or data science stakeholders.
- Built, scaled, and maintained multi-terabyte data sets
- Experience with at least one major cloud's suite of offerings (AWS, GCP, Azure).
- Developed with Data Quality at the core of your pipelines (e.g. Great Expectations, Datafold)
- Developed or enhanced ETL orchestration tools
- Familiarity with Data Discovery tooling (e.g. Amundsen, Atlas)
- Worked within standard GitOps workflow (branch and merge, PRs, CI / CD systems)
- Familiarity with infrastructure configuration (IaC [e.g. Terraform], cluster parameter tuning, service parameter tuning)
You’ll Love:
- Industry-leading compensation package
- Excellent medical, dental, and vision coverage
- A rewarding 401k program
- Flexible vacation policy
- Roflex - Flexible and supportive work policy
- Roblox Admin badge for your avatar
- At Roblox HQ:
  - Free catered lunches five times a week and several fully stocked kitchens with unlimited snacks
  - Onsite fitness center and fitness program credit
  - Annual Caltrain Go Pass
Roblox provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.