Staff Data Engineer
San Francisco, New York, or Remote within the U.S.
Medium’s mission is to help people deepen their understanding of the world and discover ideas that matter. We are building a place where ideas are judged on the value they provide to readers, not the fleeting attention they can attract for advertisers. We are creating the best place for reading and writing on the internet—a place where today’s smartest writers, thinkers, experts, and storytellers can share big, interesting ideas.
We are looking for a Staff Data Engineer who will help build, maintain, and scale our business-critical Data Platform. In this role, you will help define a long-term vision for the Data Platform architecture and implement new technologies to help us scale our platform over time. You'll also lead development of both transactional and data warehouse designs, and mentor our team of cross-functional engineers and Data Scientists.
At Medium, we are proud of our product, our team, and our culture. Medium’s website and mobile apps are accessed by millions of users every day. Our mission is to move thinking forward by providing a place where individuals, along with publishers, can share stories and their perspectives. Behind this beautifully crafted platform is our engineering team, which works together seamlessly. From frontend to API, from data collection to product science, Medium engineers work cross-functionally with open communication and feedback.
What Will You Do
- Work on high impact projects that improve data availability and quality, and provide reliable access to data for the rest of the business.
- Drive the evolution of Medium's data platform to support near real-time data processing and new event sources, and to scale with our fast-growing business.
- Help define the team strategy and technical direction, advocate for best practices, investigate new technologies, and mentor other engineers.
- Design, architect, and support new and existing ETL pipelines, and recommend improvements and modifications.
- Be responsible for ingesting data into our data warehouse and for providing frameworks and services, including Spark-based ones, for operating on that data.
- Analyze, debug, and maintain critical data pipelines.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Spark, and AWS technologies.
Who You Are
- You have 7+ years of software engineering experience.
- You have 3+ years of experience writing and optimizing complex SQL and ETL processes, preferably in connection with Hadoop or Spark.
- You have outstanding coding and design skills, particularly in Java/Scala and Python.
- You have helped define the architecture, tooling, and strategy for a large-scale data processing system.
- You have hands-on experience with AWS services such as EC2, SQS, SNS, RDS, and ElastiCache, or with equivalent technologies.
- You have a BS in Computer Science / Software Engineering or equivalent experience.
- You have knowledge of Apache Spark, Spark Streaming, Kafka, Scala, Python, and similar technology stacks.
- You have a strong understanding of algorithms and data structures and know how to apply them.
Nice To Have
- Snowflake knowledge and experience
- Looker knowledge and experience
- Dimensional modeling skills
At Medium, we foster an inclusive, supportive, fun yet challenging team environment. We value having a team made up of people from diverse backgrounds, and we respect the healthy expression of diverse opinions. We embrace experimentation and the examination of all kinds of ideas through reasoning and testing. Come join us as we continue to change the world of digital media. Medium is an equal opportunity employer.
Interested? We'd love to hear from you.