Software Engineer, Data Infrastructure
Vancouver
Our mission is to bring community and belonging to everyone in the world. Reddit is a community of communities where people can dive into anything through experiences built around their interests, hobbies, and passions. With more than 50 million people visiting 100,000+ communities daily, it is home to the most open and authentic conversations on the internet. From pets to parenting, skincare to stocks, there’s a community for everybody on Reddit. For more information, visit redditinc.com.
This community of users generates 65B analytics events per day, each of which is ingested by the Data Platform team into a data warehouse that sees 55,000+ daily queries.
As a data infrastructure engineer, you will build and maintain the data infrastructure tools used by the entire company to generate, ingest, and access petabytes of raw data. A focus on performance and optimization will enable you to write scalable, fault-tolerant code while collaborating with a team of top engineers, all while learning about and contributing to one of the most powerful streaming event pipelines in the world.
Not only will your work directly impact hundreds of millions of users around the world, but your output will also shape the data culture across all of Reddit!
How you will contribute:
- Refine and maintain our data infrastructure technologies to support real-time analysis of hundreds of millions of users.
- Continuously evolve the data model and data schema based on business and engineering requirements.
- Own the data pipeline that surfaces 65B+ daily events to all teams, and the tools we use to improve data quality.
- Support warehousing and analytics customers that rely on our data pipeline for analysis, modeling, and reporting.
- Build data pipelines with distributed streaming tools such as Kafka, Kinesis, Flink, or Spark.
- Ship quality code to enable scalable, fault-tolerant, and resilient services in a multi-cloud architecture.
Qualifications:
- 2+ years of coding experience in a production setting writing clean, maintainable, and well-tested code.
- Experience with object-oriented programming languages such as Scala, Python, Go, or Java.
- Degree in Computer Science or equivalent technical field required.
- Experience with scaling large production systems is highly preferred.
- Experience working with any of the following technologies: Terraform, Helm, Prometheus, Docker, Kubernetes, Kafka, Spark, Flink, and CI/CD.
- Excellent communication skills to collaborate with stakeholders in engineering, data science, machine learning, and product.
Reddit is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, please contact us at ApplicationAssistance@Reddit.com.
Perks/benefits: Team events