Data Engineer, Informatics & ML Platform

Somerville, MA USA

Applications have closed

Flagship Pioneering, Inc.

We are Flagship Pioneering. We are a biotechnology company that invents platforms and builds companies that change the world.

Posted 1 year ago

Company Summary:

What if you could join a rapidly growing company and play a critical role in bringing new medicines to patients by looking at and treating disease in a revolutionary way?

Cellarity's mission is to bring breakthrough medicines to patients by completely redefining the way drugs are discovered. Founded by Flagship Pioneering in 2017, Cellarity is designing medicines against the cell as opposed to a single molecular target. The company has developed a unique combination of expertise across network biology, chemistry, high-resolution data, and machine learning to unlock new treatment options in a vast array of disease areas. Cellarity currently has drug discovery programs underway in metabolic disease, hematology, immuno-oncology, and respiratory disease. The company has raised $123 million as part of a Series B funding round with contributions from world-renowned investors such as BlackRock, The Baupost Group, and Banque Pictet, alongside Flagship Pioneering.

What this position is all about:

Research Informatics & Data Engineering is part of an enterprise effort to enable data-driven science at Cellarity by building a robust technology platform. This partner-centric group is embedded with stakeholders across Cellarity’s novel pipeline value chain from Computation & Data Science to Exploratory & Platform Biology and Medicinal Chemistry. Our focus is to build an end-to-end operational platform bridging lab data generation and data science in an exploratory environment, ensuring data is democratized across the company. We consistently strive to innovate, iterate, and improve our practices, while driving novel drug discovery at Cellarity.

The successful candidate will be responsible for advancing and optimizing our data infrastructure, architecture, integrations, and pipeline development, building a robust computational platform in collaboration with our bench and data scientists.

What will you be responsible for?

Design, implement, test, and maintain data pipelines for various workloads, including scientific data ingestion, platform integrations, instrument raw data processing, computational & data science workflows, ML model training, and inference at scale.
Develop well-documented production-ready code, working in a collaborative CI/CD development environment including use of git and participation in code reviews.
Design and implement high-quality testable APIs and microservices.
Implement and maintain databases for raw and processed scientific data from a variety of internal and external sources (e.g., partner and public repositories).
Design data models for entities, assays, and results from experiments and informatics pipelines in collaboration with bench and computational scientists.
Define, contribute to, and proactively communicate data engineering standards and practices, establishing repeatable templates and frameworks and promoting efficient usage of cloud services and tools.
Manage relationships and build solutions with external consultants/contractors and vendor engineers.
Innovate and advise on the latest technologies and standard methodologies in data engineering, and identify and implement effective technical solutions.
Assist in the management and administration of our AWS environment.

What experiences will you need?

BS/MS in Computer Science, Bioinformatics, Data Science, or a related discipline with 5+ years of software engineering experience.
5+ years of hands-on Python development experience, including Pythonic design and object-oriented programming. Experience with R is a plus.
Demonstrated proficiency with workflow orchestration frameworks such as Prefect, Airflow, Nextflow, Snakemake, or AWS Step Functions; scientific data and NGS pipeline development is a plus.
Demonstrated proficiency with cloud development (AWS strongly preferred) using infrastructure-as-code frameworks and computing services (e.g., AWS ECS, Batch).
Proficiency with database engineering and optimization (e.g., PostgreSQL, GraphQL, Redshift, Aurora).
Practical experience with data and metadata modeling, including alignment of optimized database design with metadata usage.
Proficiency with modern software development methodologies such as Agile, source control, project management, and issue tracking with JIRA.
Demonstrated ability to work successfully in cross-functional teams, with an emphasis on teamwork, collaboration, and communication within the team and across the department.

What will set you apart?

Professional AWS certifications.
Experience in building pipelines/workflows for biomedical, NGS, and/or high-throughput molecular profiling data.
Experience with Electronic Lab Notebook (ELN) & LIMS platforms.
Proficiency with container strategies using Docker, Fargate, and ECR.
Proficiency with Linux and shell scripting.
Experience working with GxP and non-GxP data.

What it’s like to work at Cellarity:

At Cellarity, we:

Push boundaries: We create a legacy with breakthrough science in the service of patients.
Act with urgency: We work quickly and with conviction, and are eager to learn from data to iterate.
Own it: We transcend our job descriptions and relentlessly follow through on our commitments.
Tell it like it is: We give regular feedback on behaviors and are accountable for how we treat people.
Energize others: We are easy to work with and build strength from differing perspectives.

Cellarity is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status.

Recruitment & Staffing Agencies: Cellarity does not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to Cellarity or its employees is strictly prohibited unless contacted directly by Cellarity’s internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of Cellarity, and Cellarity will not owe any referral or other fees with respect thereto.
