Data Engineer

New York City, NY

Applications have closed

MongoDB

Get your ideas to market faster with a developer data platform built on the leading modern database. MongoDB makes working with data easy.

Posted 1 year ago

The database market is massive (IDC estimates it to be $121B+ by 2025!) and MongoDB is at the head of its disruption. At MongoDB we are transforming industries and empowering developers to build amazing apps that people use every day. We are the leading modern data platform and the first database provider to IPO in over 20 years. Join our team and be at the forefront of innovation and creativity.

MongoDB is growing rapidly and seeking a Data Engineer to be a key contributor to the company’s Internal Data Platform. You will build ETL pipelines that pull data into our Data Lake/Warehouse and that will be used to drive forward our growth as a product and as a company. You will take on complex data-related problems using very diverse data sets, and will work with stakeholder groups throughout the company to help them make better data-informed decisions.

We are looking to speak to candidates who are based in New York City, NY.

Our ideal candidate has experience with

Building ETL pipelines at scale that can grow without sacrificing performance
Data Lake/Warehouse design patterns and concepts, including Delta Lakes
Several programming languages (Python, Scala, Java, etc.)
Data processing frameworks such as Spark and Pandas
Orchestration tools such as Airflow, Luigi, Azkaban, Cask, etc.
AWS services such as S3, Kinesis, EMR, Lambda, Athena, Glue, IAM, RDS, etc.
Different storage formats such as Parquet, JSON, Avro, and Arrow
Streaming data processing frameworks like Kafka, KSQL, and Spark Streaming
A diverse set of databases (MongoDB, Redshift, etc.)

You might be an especially great fit if you

Enjoy wrangling huge amounts of data and exploring new data sets
Value code simplicity and performance
Obsess over data: everything needs to be accounted for and thoroughly tested
Plan effective data storage, security, sharing, and publishing within an organization
Constantly think of ways to squeeze better performance out of data pipelines

Nice to haves

You are deeply familiar with Spark and/or Hive
You have expert experience with Airflow
You understand the differences between different storage formats like Parquet, Avro, Arrow, and JSON and when to use each
You understand the tradeoffs between different schema designs like normalization vs. denormalization
In addition to data pipelines, you’re also quite good with Kubernetes, Drone, and Terraform
You’ve built an end-to-end production-grade data solution that runs on AWS or GCP
You have experience building machine learning pipelines using tools such as SparkML, TensorFlow, scikit-learn, etc.

Responsibilities

As a Data Engineer, you will

Build large-scale batch and real-time data pipelines with data processing frameworks including Spark and Kinesis
Help drive best practices in continuous integration and delivery
Help drive optimization, testing, and tooling to improve data quality
Collaborate with other software engineers, machine learning experts, and stakeholders, taking learning and leadership opportunities that will arise every single day

To drive the personal growth and business impact of our employees, we’re committed to developing a supportive and enriching culture for everyone. From employee affinity groups, to fertility assistance and a generous parental leave policy, we value our employees’ wellbeing and want to support them along every step of their professional and personal journeys. Learn more about what it’s like to work at MongoDB, and help us make an impact on the world!

MongoDB is committed to providing any necessary accommodations for individuals with disabilities within our application and interview process. To request an accommodation due to a disability, please inform your recruiter.

MongoDB, Inc. provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type and makes all hiring decisions without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Tags: Airflow Arrow Athena Avro AWS Azkaban Cask Data pipelines Data quality ETL GCP JSON Kafka Kinesis Kubernetes Lambda Luigi Machine Learning MongoDB Pandas Parquet Pipelines Python Redshift Scala Scikit-learn Security Spark SparkML Streaming TensorFlow Terraform Testing

Perks/benefits: Career development Fertility benefits Parental leave

Region: North America

Country: United States

Category: Engineering Jobs
