Staff Data Pipeline Engineer

Remote US, Remote Canada, Remote Toronto/Vancouver Area, Remote San Francisco Bay Area

Applications have closed

Mozilla

Mozilla is the not-for-profit behind the lightning-fast Firefox browser. We put people over profit to give everyone more power online.

The company

Pocket empowers people to discover, organize, consume, and share content that matters to them. Our apps and platform are essential ways that tens of millions of people discover and consume content on the web. Pocket is the Web, curated: for you and by you.

The opportunity

For content recommendations, everything starts with data. Pocket’s Data Products team builds systems that combine machine learning with editorial expertise to surface high-quality content from across the internet. Ensuring data privacy when collecting, distributing, validating, and securing data at scale is no small task, and every engineer on our team plays a vital role in shaping each user’s experience.

We are looking for a Lead Data Pipeline Engineer to own the design and development of data pipeline applications for complex, extensible, and highly scalable cloud-based data platforms. Are you passionate about building intuitive data models? Do you excel at taking vague requirements and crystallizing them into scalable data solutions? We invite you to apply!

People who excel on our team thrive in small, dynamic environments. We cover many areas, including machine learning, product engineering, machine learning operations, and data modeling.

Who you are

  • Enjoy working on small, dynamic teams.
  • Understand the data lifecycle and concepts such as lineage, governance, privacy, retention, and anonymity.
  • Conceptually familiar with AWS cloud resources (S3, EC2, RDS, etc.).
  • A trusted authority in distributed data processing patterns.
  • Highly proficient in at least one of Java, Python, or Scala.
  • Comfortable with complex SQL.
  • Experience designing, building, and maintaining data lakes.

What you'll do

  • Build and maintain data pipeline applications.
  • Design, create, and maintain the data platform's data model at the conceptual, logical, and physical levels.
  • Establish data security, quality, load, transport and performance models.
  • Research, design, document and modify data pipeline software specifications throughout the production life cycle.
  • Develop and maintain stakeholder documentation and operations procedures, programs, security, etc., and assist in eliminating redundancy and automating manual processes.
  • Assist in developing standards and criteria for the successful implementation of new systems.
  • Perform code reviews and mentor other engineers.

Bonus experience

  • Cloud warehouses: Snowflake, BigQuery, Redshift
  • Feature stores: SageMaker, Databricks, Vertex
  • Orchestrators: Airflow, Prefect
  • Compute frameworks: AWS Glue, Spark, Hadoop, Athena
  • Streaming data: Kinesis, Kafka
  • Data modeling: dbt

About Pocket

We’re a remote-first team. Video conferencing, Slack chats, and shared documents keep everyone in the loop and make sure no one feels isolated. We value transparency and collaboration from the CEO on down.

As a subsidiary of Mozilla, we have the nimbleness of a small team with the resources of a large company, which means each teammate has the opportunity to make a big impact. We also make sure our working hours are flexible, not just because we have team members in different time zones, but because we know you have a life outside the office, and we value that. You’re human, we’re human, and everyone at Pocket is treated with the utmost respect.

Commitment to diversity, equity, inclusion, and belonging

Mozilla understands that valuing diverse creative practices and forms of knowledge is crucial to, and enriches, the company’s core mission. We encourage applications from everyone, including members of all equity-seeking communities, such as (but certainly not limited to) women, racialized and Indigenous persons, persons with disabilities, and persons of all sexual orientations, gender identities, and expressions.

We will ensure that qualified individuals with disabilities are provided reasonable accommodations to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment, as appropriate. Please contact us at hiringaccommodation@mozilla.com to request accommodation.

We are an equal opportunity employer. We do not discriminate on the basis of race (including hairstyle and texture), religion (including religious grooming and dress practices), gender, gender identity, gender expression, color, national origin, pregnancy, ancestry, domestic partner status, disability, sexual orientation, age, genetic predisposition, medical condition, marital status, citizenship status, military or veteran status, or any other basis covered by applicable laws. Mozilla will not tolerate discrimination or harassment based on any of these characteristics or any other unlawful behavior, conduct, or purpose.

Group: C

#LI-REMOTE

Tags: Airflow Athena AWS BigQuery Databricks EC2 Engineering Excel Hadoop Kafka Kinesis Machine Learning Python Redshift Research SageMaker Scala Security Snowflake Spark SQL Streaming

Perks/benefits: Career development Flex hours Salary bonus Transparency

Regions: Remote/Anywhere North America
Countries: Canada United States
