Sr Genomic Data Scientist (Hereditary Disease)

Redwood City, CA

Full Time Senior level / Expert
Tempus

Posted 1 month ago

Passionate about precision medicine and advancing the healthcare industry?

Recent advancements in genomics and computer technology have finally made it possible for AI to impact clinical care in a meaningful way. Tempus' proprietary platform connects an entire ecosystem of real-world evidence to deliver real-time, actionable insights to physicians, providing critical information about the right treatments for the right patients, at the right time.

We are seeking a genomics data scientist with interdisciplinary experience, including a track record of supporting innovative, high-quality research by managing and modelling large volumes of clinical, genetic and/or genomic data and results in a distributed database and analytical environment. You will lead the data ingestion, organization, and implementation of analysis workflows for large-scale human cohorts with genetic and multi-dimensional, multi-modality phenotypic data. These data will come from public sources as well as from private datasets generated in-house or obtained through our collaborations.

Responsibilities

  • Bring in genomics/genetics datasets from external and internal sources to help develop internal resources for various analytical approaches.
  • Prototype a robust data platform to efficiently house and represent critical human genetic, genomic, and clinical/phenotypic data, to inform genetic risk predictive models, cohort selection and clinical test validation across a range of disease areas.
  • Develop scalable, high-quality analysis pipelines for clinical trials and clinical diagnostics products.
  • Leverage the opportunities and efficiencies afforded by access to a hybrid cloud-based, distributed ecosystem of database technologies.
  • Collaborate with other data scientists and statistical geneticists to leverage multimodal data in training polygenic risk scores, machine learning models, and other predictive models.
  • Work with scientists and clinicians to design and perform analyses on clinical sequencing data that generate clinically actionable insights in order to improve quality of care.
  • Communicate with internal and external scientific teams as well as product, science, and bioinformatics leadership.
  • Produce high-quality, detailed documentation for all projects.

Required Experience

  • PhD/Masters or equivalent experience in genetics, biomedical informatics, or related life sciences areas.
  • 5+ years of experience in complex data analysis and architecture design, and familiarity with applications of FAIR principles.
  • Hands-on experience developing and maintaining database systems and manipulating data using SQL, working within a POSIX CLI environment.
  • Computational skills using Python (strongly preferred), Java, C/C++ or other programming languages.
  • Experience with complex longitudinal human clinical/phenotype data, e.g. from electronic health records, epidemiological cohorts, or clinical trials.
  • Experience with genetic and genomic data types, including public genetic databases and results data from high-throughput genetic assays (e.g. UK Biobank, gnomAD).

Ideal Candidates Will Possess

  • Experience with Python/Jupyter notebooks and/or R/Bioconductor in analyzing large data sets.
  • Experience mining modern, large-scale genetic databases (e.g. ExAC/gnomAD, UK Biobank, UK10K, EBI GWAS Catalog, 1KG).
  • Experience with distributed database technologies and related big-data analysis tools (e.g. Spark, BigQuery, the Apache Hadoop/Hive ecosystem).
  • Experience with communicating insights and presenting concepts to a diverse audience.
  • Demonstrated knowledge of best-practice coding processes and data change control.
  • Experience in implementing and parallelizing pipelines in cloud computing environments.
  • Experience in analyzing large-scale multimodal datasets to train and validate machine learning models.
  • Self-driven and works well in interdisciplinary teams.
  • Track record of publications.
Job tags: AI BigQuery Hadoop Healthcare Java Machine Learning Python R Research Spark SQL
Job region(s): North America