Data Engineer
Remote
Ginkgo Bioworks
In response to the global COVID-19 pandemic, we have launched a new division: Concentric by Ginkgo. We firmly believe that testing is an essential requirement to re-open the economy as quickly and safely as possible. Concentric is the largest provider of K-12 COVID-19 testing. We currently run state-level K-12 testing programs in more than 10 states and some of the nation’s largest school districts. To date, Concentric has sequenced more than 10,000 samples from a variety of sample types, in support of public health. As we continue to scale Concentric by Ginkgo, our work is also evolving and expanding into new and exciting directions.
To support the Concentric initiative - and help drive our biosecurity work forward - we’re looking for a talented Data Engineer to join our team. With your expertise, you’ll help build the Concentric platform, which brings more COVID-19 testing to more people through optimized testing programs that work across test modalities and a growing lab network. You’ll collaborate with a team of talented and diverse engineers and product managers, as well as experts in data analysis, bioinformatics, and epidemiology. You’ll design and maintain data pipelines, ETL/ELT workflows, data marts, and high-quality views that empower data-driven decision-making for Concentric and our customers. You’ll continuously develop new capabilities that make Concentric’s platform the center of long-term biosecurity, and you’ll contribute to Concentric’s growing bioinformatics capabilities. This is your chance to develop high-quality data models that can have a real impact on the world.
We use PostgreSQL and Snowflake along with Tableau for internal-facing data tools, with dbt coming soon for the transformation layer. Our backend tech stack includes AWS, Aptible, Docker, Flask, Kubernetes, Postgres, and other modern tools and frameworks. Our languages of choice are Python and JavaScript. While we don’t expect you to be an expert in every platform, we do expect you to have a solid foundation in data engineering, including top-notch SQL skills, database and data warehouse design, and data quality and cleaning, and to be able to master the technologies that we use today (and will adopt in the future). We also expect you to learn the meaning behind our data and the ways it can bring value to the business and our customers.
Responsibilities
- Develop scalable data pipelines and transformations to support continuing increases in data volume, complexity, and use cases
- Develop solutions using Snowflake, Airflow, and dbt
- Work with the data platform team to continuously improve our data tech stack to ensure high data availability, consistency, and quality
- Write and maintain complex queries as required for implementing ETL/ELT
- Participate in data collection and schema design discussions to ensure data is captured in a way that supports downstream use cases
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Work with analytics and stakeholder teams to support data-related technical issues and infrastructure needs
Minimum Requirements
- Bachelor’s degree in a related field with 4+ years of relevant experience, or Master’s degree with 3+ years of relevant experience
- Expert-level skills in SQL
- Prior industry experience preferred
Preferred Capabilities and Experience
- Experience using Python, dbt, Airflow, or equivalent technologies
- Proficiency with Snowflake and the AWS cloud platform
- Proficiency with Git or comparable source control, and CI/CD tools
- Familiarity with data regulations such as HIPAA, GDPR, and SOX
- Familiarity with or interest in MLOps and implementing statistical/ML/AI models
- Strong written and verbal communication skills
- Passion for understanding the meaning behind our data
- Familiarity with public health, epidemiology, and bioinformatics, with a passion for Concentric’s mission to transform epidemiological surveillance
Please note: we don’t expect candidates to have all of these, so if you feel there is a good fit, please consider applying!
To learn more about Ginkgo, check out some recent press:
- What is it really like to take your company public via a SPAC? One Boston biotech shares its journey (Fortune)
- Ginkgo Bioworks resizes the definition of going big in biotech, raising $2.5B in a record SPAC deal that weighs in with a whopping $15B-plus valuation (Endpoints News)
- Ginkgo Bioworks CEO on scaling up Covid-19 testing: ‘If we try, we can win’ (CNBC)
- Ginkgo raises $70 million to ramp up COVID-19 testing for employers, universities (Boston Globe)
- Ginkgo Bioworks Redirects Its Biotech Platform to Coronavirus (Wall Street Journal)
- Ginkgo Bioworks Provides Support on Process Optimization to Moderna for COVID-19 Response (PRNewswire)
- The Life Factory: Synthetic Organisms From This $1.4 Billion Startup Will Revolutionize Manufacturing (Forbes)
- Synthetic Bio Pioneer Ginkgo Raises $290 Million in New Funding (Bloomberg)
- Ginkgo Bioworks raises $350 million fund for biotech spinouts (Reuters)
- Can This Company Convince You to Love GMOs? (The Atlantic)
We also feel that it’s important to point out the obvious here – there’s a serious lack of diversity in our industry, and that needs to change. Our goal is to help drive that change. Ginkgo is deeply committed to diversity, equity, and inclusion in all of its practices, especially when it comes to growing our team. Our culture promotes inclusion and embraces how rewarding it is to work with people from all walks of life.
We’re developing a powerful biological engineering platform, so we must remain mindful of the many ways our technology can – and will – impact people around the world. We care about how our platform is used, and having a diverse team to build it gives us the best chance that it’s something we’ll be proud of as it continues to grow. Therefore, it’s critical that we incorporate the diverse voices and visions of all those who play a role in the future of biology.
It is the policy of Ginkgo Bioworks to provide equal employment opportunities to all employees and employment applicants.
Tags: Airflow AWS Biology CI/CD Data analysis Data pipelines Data quality Data warehouse Docker ELT Engineering ETL Flask Git JavaScript Kubernetes Machine Learning MLOps Pipelines PostgreSQL Python Snowflake SQL Statistics Tableau Testing
Perks/benefits: Career development Startup environment