Data Engineer

United States

Applications have closed

Be Part of Building the Future

Dremio is the easy and open data lakehouse platform, providing an intuitive UI that enables data teams to get started with analytics in minutes, as well as the flexibility to use Dremio’s lightning-fast SQL query service and any other data processing engine. Dremio increases agility with a revolutionary data-as-code approach that adopts Git concepts to enable data experimentation, version control, and governance. In addition, Dremio breaks down data silos by simplifying ingestion into the lakehouse and by allowing queries directly on databases and data warehouses. All of this is available through a fully managed service that not only eliminates the need to maintain infrastructure and software, but also automatically optimizes the data in the lakehouse to maximize performance for every workload.

About the role

Dremio’s development leaders ensure that Dremio Cloud and our data lake offering are backed by scalable, resilient solutions whose uptime and performance meet SLAs. Dremio is growing quickly, and building cloud infrastructure, SaaS, and services that enable developer velocity will have an immediate and visible impact on Dremio’s success. You will enable data-driven decision making and customer engagement by creating a self-service semantic layer for the product and sales teams to use.

What you'll be doing

  • Create and maintain a data lake of customer and product usage metrics, which will be used to derive insights that drive product and growth strategies.
  • Design and implement workflows for ingestion and transformation for various data sources (S3, GCS, Google Analytics).
  • Optimize the retrieval of structured and unstructured data to make it actionable in real time.
  • Help develop a strategy for a long-term data architecture that will allow Dremio to make effective data-driven decisions to optimize the customer’s experience.
  • Develop and maintain scalable and reliable data pipelines to support gradual increases in data volume and complexity.
  • Collaborate with the Product Management and Engineering teams to incorporate new use cases and sources of data.

What we're looking for

  • 5+ years of experience as a data engineer in a SaaS environment
  • Experience with the various stages of the data pipeline - ingesting and transforming data, designing a schema, building checks and redundancy, and adapting schemas based on new requirements.
  • Deep understanding of data lakes and relational databases
  • Knowledge of data formats such as JSON and Parquet
  • Experience with Apache Spark and Python libraries such as Pandas for data manipulation
  • Experience with AWS and GCP
  • Experience with ETL/ELT tools
  • Experience working with data projects and ensuring the highest levels of data integrity and quality.
  • Ability to scope, schedule, and resource complex projects in collaboration with other partners such as Product and Engineering.
  • Experience working with CI/CD pipelines and DevOps, and delivering quality in a fast-paced environment.
  • Familiarity with BI and data science tools such as Tableau, Superset, Jupyter.
  • Excellent communication skills with both technical and non-technical audiences.

Bonus points if you have

  • Experience with Apache Iceberg

What we offer

  • Medical, dental and vision insurance 
  • 401(k) Plan
  • Short-term and long-term disability and life insurance
  • Pre-IPO stock options
  • Flexible PTO
  • 16 hours of volunteer time off
  • 12 company-paid holidays, including Juneteenth
  • Remote work options
  • Monthly “Get Stuff Done” (GSD) Days
  • Paid parental leave
  • Employee Assistance Program (EAP)
  • Quarterly swag surprise

**Certain benefits are available only to full-time Dremio employees and may not be the same across all locations.

What we value 

At Dremio, we hold ourselves to high standards when it comes to People, Thinking, and Action. Our Gnarlies (that's what we call our employees) communicate with clarity, drive accountability, and are respectful towards each other. We confront brutal facts and focus on results while operating with a sense of urgency and building a "flywheel". People who like to jump in and drive momentum will thrive in our #GnarlyLife.

Dremio is an equal opportunity employer supporting workforce diversity. We do not discriminate on the basis of race, religion, color, national origin, gender identity, sexual orientation, age, marital status, protected veteran status, disability status, or any other unlawful factor.

Dremio is committed to providing any necessary accommodations for individuals with disabilities within our application and interview process. To request accommodation due to a disability, please inform your recruiter.

Dremio has policies in place to protect the personal information that employees and applicants disclose to us.
