Staff Data Engineer

New York City or Remote

Applications have closed

Catalyst

Customer success software that helps you centralize customer data, get a clear view of customer health, and scale experiences that drive retention & growth.

Catalyst Overview

Catalyst is a fast-growing B2B SaaS company that’s helping organizations turn Customer Success into a company-wide vision. Built by an experienced team of industry leaders, our software integrates with all of the tools that CS teams are already using to provide one centralized view of customer data. Our modern and intuitive dashboards help CS leaders develop impactful workflows and take the right actions to understand health, prevent churn, increase adoption, and drive expansion.

 

Position Overview

Insights and intelligence are the cornerstones of our product offering. We ingest and process massive amounts of data from a variety of sources to help our users understand the overall health of their customers at each stage of their journey. As a Staff Data Engineer, you will be directly responsible for designing and implementing the next-generation data architecture, built on top of Databricks, Fivetran, and TiDB.

 

What You’ll Do

  • Lead high-impact, cross-functional data engineering projects built on top of a modern, best-in-class data stack, working with a variety of open-source and cloud technologies
  • Solve interesting and unique data problems at high volume and large scale  
  • Build and optimize the performance of batch, stream, and queue-based solutions, including Kafka and Apache Spark
  • Define long-term vision and strategies for data architecture (single/multi-tenancy, bi-directional real-time sync, data models to support a variety of customer use cases, etc.)
  • Collaborate with stakeholders from different teams to drive forward the data roadmap
  • Help define customer-facing, data-driven application features based on backend capabilities
  • Set and implement data retention, security and governance standards
  • Work with all engineering teams to establish the framework and best practices for ownership and self-serve data processing
  • Establish standards, guidelines, tooling and best practices for data engineering at Catalyst
  • Mentor other data engineers, and drive education and advocacy around data
  • Advocate for data quality, cost-effective scalability, and distributed system reliability, and establish automated mechanisms to improve these
  • Work cross-functionally with application engineers, SRE, product, data analysts, data scientists, and ML engineers

 

What You’ll Need

  • 5+ years of experience successfully implementing modern data architectures
  • Strong project management skills
  • Demonstrated experience implementing ETL pipelines, preferably with Apache Spark in Python and SQL
  • Proficiency in Python or another programming language
  • Deep understanding of SQL with relational data stores such as Postgres or MySQL
  • A strong desire to show ownership of problems you identify, and proven ability to empower others to get more done
  • Experience with Data Warehouses and Lakes such as Redshift, Snowflake, and Databricks Delta Lake
  • Experience with distributed streaming tools like Kafka and Spark Structured Streaming
  • Familiarity with workflow tools such as Airflow, dbt, and Delta Live Tables
  • Experience with automated testing for distributed systems (unit testing, E2E testing, QA, CI/CD, data expectation monitoring)
  • Experience working with application engineers, product, and data scientists
  • Experience leading projects
  • Experience with additional data stores, preferably Elasticsearch

 

Why You’ll Love Working Here!

  • Monthly Mental Health Days and Mental Health Weeks twice per year 
  • Highly competitive compensation package, including equity 
  • Comprehensive benefits, including up to 100% paid medical, dental, & vision insurance coverage for you & your loved ones
  • Open vacation policy, encouraging you to take the time you need 
  • Annual education stipend, to ensure that you’re continuously expanding your skill set
  • Monthly wellness stipend, to ensure that you’re taking care of both your physical & mental health
  • Monthly remote team-building events, including game nights, trivia, cooking/mixology classes, and more!

 

Salary information: The estimated base salary range for this position is $162,000-$220,000 USD. Additionally, we offer a competitive equity package and comprehensive benefits. Actual compensation is based on factors such as the candidate's skills, qualifications, experience and location.  

 

Catalyst is an equal opportunity employer, meaning that we do not discriminate based upon race, religion, national origin, gender identity, age, sexual orientation, or any other protected class. We believe that diversity is more than just good intentions, and we are committed to creating an inclusive environment for all employees.

 

      

Tags: Airflow Architecture CI/CD Databricks Data quality Distributed Systems Elasticsearch Engineering ETL Fivetran Kafka Machine Learning MySQL Open Source Pipelines PostgreSQL Python Redshift Security Snowflake Spark SQL Streaming Testing

Perks/benefits: Competitive pay Equity Flex vacation Health care Home office stipend Startup environment Team events Wellness

Regions: Remote/Anywhere North America
Country: United States