Staff Data Engineer
New York
ASAPP
Elevate human performance using the power of AI. Achieve breakthrough results in customer experience by empowering your agents with integrated automation.
The Data Engineering & Analytics team (DEA) at ASAPP powers the core of our data and analytics products. ASAPP's products are based on natural language processing and serve tens of millions of end-users in real time. We need sophisticated metrics to monitor and continuously improve our systems. We are seeking a Staff Data Engineer to serve as both a technical leader and a core individual contributor, designing and building analytic data feeds for both our business partners and internal stakeholders.
Applicants with all or some relevant combination of the requirements listed below are encouraged to apply. This is a hybrid role, with a preference for candidates in proximity to either our NYC or Mountain View office.
What you'll do
- Lead the batch analytics team by providing the groundwork to modernize our data analytics architecture
- Design and maintain our data warehouse to facilitate analysis across hundreds of system events
- Rethink and influence strategy and roadmap for building efficient data solutions and scalable data warehouses
- Review code for style and correctness across the entire team
- Write production-grade Redshift, Athena, Snowflake & Spark SQL queries
- Manage and maintain Airflow ETL jobs
- Test query logic against sample scenarios
- Work across teams to gather requirements and understand reporting needs
- Investigate metric discrepancies and data anomalies
- Debug and optimize queries for other business units
- Review schema changes across various engineering teams
- Maintain high-quality documentation for our metrics and data feeds
- Work with stakeholders in Data Infrastructure, Engineering, Product, and Customer Strategy to assist with data-related technical issues and build a scalable cross-platform reporting framework
- Participate in and co-manage our on-call rotation to keep production pipelines up and running
What you'll need
- 7+ years of industry experience, with clear examples of strategic technical problem solving and implementation
- Expertise in at least one flavor of SQL (we use Amazon Redshift, MySQL, Athena, and Snowflake)
- Strong experience with data warehousing (e.g. Snowflake (preferred), Redshift, BigQuery, or similar)
- Experience with dimensional data modeling and schema design
- Experience using developer-oriented data pipeline and workflow orchestration tools (e.g. Airflow (preferred), dbt, Dagster, or similar)
- Experience with cloud computing services (AWS (preferred), GCP, Azure or similar)
- Proficiency in a high-level programming language, especially in reading and comprehending other developers’ code and intentions (we use Python, Scala, and Go)
- Deep technical knowledge of data exchange and serialization formats such as Protobuf, YAML, JSON, and XML
- Familiarity with BI & Analytics tools (e.g. Looker, Tableau, Sisense, Sigma Computing, or similar)
- Familiarity with streaming data technologies for low-latency data processing (e.g. Apache Spark/Flink, Apache Kafka, Snowpipe or similar)
- Familiarity with Terraform, Kubernetes and Docker
- Understanding of modern data storage formats and tools (e.g. Parquet, Avro, Delta Lake)
- Knowledge of modern data design and storage patterns (e.g. incremental updates, partitioning and segmentation, rebuilds and backfills)
What we'd like to see
- Experience working at a startup preferred
- Excellent communication skills (Slack, email, documents)
- Experienced with end-user management and communication (cross-team as well as external)
- Must thrive in a fast-paced environment and be able to work independently with urgency
- Can work effectively remotely (proactive about managing blockers, reaching out, and asking questions, and active in team activities)
- Experienced in writing technical data design docs (pipeline design, dataflow, schema design)
- Can scope and break down projects, and communicate progress and blockers effectively with your manager, team, and stakeholders
- Good at task management and capacity tracking (Jira preferred)
Perks/benefits: Startup environment Team events