Data Engineer
Gurugram
MongoDB
Get your ideas to market faster with a developer data platform built on the leading modern database. MongoDB makes working with data easy. The database market is massive (IDC estimates it to be $119B+ by 2025!) and MongoDB is at the head of its disruption. The MongoDB community is transforming industries and empowering developers to build amazing apps that people use every day. We are the leading modern data platform and the first database provider to IPO in over 20 years. Join our team and be at the forefront of innovation and creativity.
Headquartered in New York, with offices across North America, Europe, and Asia-Pacific, MongoDB has more than 17,000 customers, which include some of the largest and most sophisticated businesses in nearly every vertical industry, in over 100 countries.
MongoDB is growing rapidly and seeking a Data Engineer to be a key contributor to the overall internal data platform at MongoDB. You will build data-driven solutions to help drive MongoDB's growth as a product and as a company. You will tackle complex data-related problems using very diverse data sets.
Our ideal candidate has experience with
- Several programming languages (Python, Scala, Java, etc.)
- Data processing frameworks like Spark
- Streaming data processing frameworks like Kafka, KSQL, and Spark Streaming
- A diverse set of databases like MongoDB, Cassandra, Redshift, Postgres, etc.
- Different storage formats like Parquet, Avro, Arrow, and JSON
- AWS services such as EMR, Lambda, S3, Athena, Glue, IAM, RDS, etc.
- Orchestration tools such as Airflow, Luigi, Azkaban, Cask, etc.
- Git and GitHub
- CI/CD Pipelines
You might be a phenomenal fit if you
- Enjoy wrangling huge amounts of data and exploring new data sets
- Value code simplicity and performance
- Obsess over data: everything needs to be accounted for and be thoroughly tested
- Plan effective data storage, security, sharing, and publishing within an organization
- Constantly think of ways to squeeze better performance out of data pipelines
Nice to haves
- You are deeply familiar with Spark and/or Hive
- You have expert experience with Airflow
- You understand the differences between different storage formats like Parquet, Avro, Arrow, and JSON
- You understand the tradeoffs between different schema designs like normalization vs. denormalization
- In addition to data pipelines, you’re also quite good with Kubernetes, Drone, and Terraform
- You’ve built an end-to-end production-grade data solution that runs on AWS
- You have experience building machine learning pipelines using tools like SparkML, TensorFlow, scikit-learn, etc.
Responsibilities
As a Data Engineer, you will
- Build large-scale batch and real-time data pipelines with data processing frameworks like Spark on AWS
- Help drive best practices in continuous integration and delivery
- Help drive optimization, testing, and tooling to improve data quality
- Collaborate with other software engineers, machine learning specialists, and partners, taking learning and leadership opportunities that will arise every single day
Success Measures
In 3 months: you will have familiarized yourself with much of our data platform and will be making regular contributions to our codebase, collaborating regularly with partners to widen your knowledge, and helping to resolve incidents and respond to user requests.
In 6 months: you will have successfully investigated, scoped, executed, and documented a small to medium-sized project and worked with partners to make sure their data needs are satisfied by implementing improvements to our platform.
In 12 months: you will have become the key person for several projects within the team and will have contributed to the data platform’s roadmap. You will have made several sizable contributions to the project and will be regularly looking to improve the overall stability and scalability of the architecture.
This role is remote-optional until 10th January 2022. We are looking to speak to candidates who plan to be available in our India office when we introduce our hybrid model.
To drive the personal growth and business impact of our employees, we’re committed to developing a supportive and enriching culture for everyone. From employee affinity groups, to fertility assistance and a generous parental leave policy, we value our employees’ wellbeing and want to support them along every step of their professional and personal journeys. Learn more about what it’s like to work at MongoDB, and help us make an impact on the world!
MongoDB is committed to providing any necessary accommodations for individuals with disabilities within our application and interview process. To request accommodation due to a disability, please inform your recruiter.
MongoDB is an equal opportunities employer.
Tags: Airflow Arrow Athena Avro AWS Azkaban Cask Cassandra CI/CD Data pipelines Git GitHub JSON Kafka KSQL Kubernetes Lambda Luigi Machine Learning MongoDB Parquet Pipelines PostgreSQL Python Redshift Scala Scikit-learn Security Spark SparkML Streaming TensorFlow Terraform Testing
Perks/benefits: Career development Fertility benefits Parental leave