Data Engineer, Informatics & ML Platform

Somerville, MA USA

Applications have closed

Flagship Pioneering, Inc.

 

Company Summary:

What if you could join a rapidly growing company and play a critical role in bringing new medicines to patients by looking at and treating disease in a revolutionary way?

Cellarity's mission is to bring breakthrough medicines to patients by completely redefining the way drugs are discovered. Founded by Flagship Pioneering in 2017, Cellarity is designing medicines against the cell as opposed to a single molecular target. The company has developed a unique combination of expertise across network biology, chemistry, high-resolution data, and machine learning to unlock new treatment options in a vast array of disease areas. Cellarity currently has drug discovery programs underway in metabolic disease, hematology, immuno-oncology, and respiratory disease. The company has raised $123 million as part of a Series B funding round with contributions from world-renowned investors such as BlackRock, The Baupost Group, and Banque Pictet, alongside Flagship Pioneering.

 

What this position is all about:

Research Informatics & Data Engineering is part of an enterprise effort to enable data-driven science at Cellarity by building a robust technology platform. This partner-centric group is embedded with stakeholders across Cellarity’s novel pipeline value chain, from Computation & Data Science to Exploratory & Platform Biology and Medicinal Chemistry. Our focus is to build an end-to-end operational platform bridging lab data generation and data science in an exploratory environment, ensuring data is democratized across the company. We consistently strive to innovate, iterate, and improve our practices, while driving novel drug discovery at Cellarity.

The successful candidate will be responsible for advancing and optimizing our data infrastructure, architecture, integrations, and pipeline development, building a robust computational platform in collaboration with our bench and data scientists.

 

What will you be responsible for?

  • Design, implement, test, and maintain data pipelines for various workloads, including scientific data ingestion, platform integrations, instrument raw data processing, computational & data science workflows, ML model training, and inference at scale.
  • Develop well-documented production-ready code, working in a collaborative CI/CD development environment including use of git and participation in code reviews.
  • Design and implement high-quality testable APIs and microservices.
  • Implement and maintain databases for raw and processed scientific data from a variety of internal and external sources (e.g., partner and public repositories).
  • Design data models for entities, assays, and results from experiments and informatics pipelines in collaboration with bench and computational scientists.
  • Define, contribute to, and proactively communicate data engineering standards and practices, establishing repeatable templates and frameworks and promoting efficient usage of cloud services and tools.
  • Manage relationships and build solutions with external consultants/contractors and vendor engineers.
  • Innovate and advise on the latest technologies and standard methodologies in Data Engineering and be able to identify and implement effective technical solutions.
  • Assist in the management and administration of our AWS environment.

 

What experiences will you need?

  • BS/MS in Computer Science, Bioinformatics, Data Science, or a related discipline with 5+ years of software engineering experience.
  • 5+ years of hands-on Python development experience, including Pythonic design and object-oriented programming. Experience with R is a plus.
  • Demonstrated proficiency with workflow orchestration frameworks such as Prefect, Airflow, Nextflow, Snakemake, and AWS Step Functions; scientific data and NGS pipeline development is a plus.
  • Demonstrated proficiency with cloud development (AWS strongly preferred) using infrastructure-as-code frameworks and computing services (e.g., AWS ECS, Batch).
  • Proficiency with database engineering and optimization (e.g., PostgreSQL, GraphQL, Redshift, Aurora).
  • Practical experience with data and metadata modeling, including alignment of optimized database design with metadata usage.
  • Proficiency with modern software development methodologies such as Agile, source control, project management, and issue tracking with Jira.
  • Demonstrated ability to work successfully in cross-functional teams, with an emphasis on teamwork, collaboration, and communication within the team and across the department.

 

What will set you apart?

  • Professional AWS certifications.
  • Experience in building pipelines/workflows for biomedical, NGS, and/or high-throughput molecular profiling data.
  • Experience with Electronic Lab Notebook (ELN) & LIMS platforms.
  • Proficiency with container strategies using Docker, Fargate, and ECR.
  • Proficiency with Linux and shell scripting.
  • Experience working with GxP and non-GxP data.

 

What it’s like to work at Cellarity:

At Cellarity, we:

  1. Push boundaries: We create a legacy with breakthrough science in the service of patients
  2. Act with urgency: We work quickly and with conviction, and are eager to learn from data to iterate
  3. Own it: We transcend our job descriptions and relentlessly follow through on our commitments
  4. Tell it like it is: We give regular feedback on behaviors and are accountable for how we treat people
  5. Energize others: We are easy to work with and build strength from differing perspectives

 

Cellarity is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.

Recruitment & Staffing Agencies: Cellarity does not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to Cellarity or its employees is strictly prohibited unless the agency has been contacted directly by Cellarity’s internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of Cellarity, and Cellarity will not owe any referral or other fees with respect thereto.



Region: North America
Country: United States