Data Engineer

London, England, United Kingdom

Applications have closed

Our Future Health

We’re bringing together up to five million people to develop new ways to prevent, detect and treat diseases.


We are looking for Data Engineers to help solve some of the key challenges in a programme of work at industrial scale and of global significance. The successful Data Engineer will contribute to the delivery of data releases that will be used worldwide, and will have experience with either NHS data or bioinformatics and genetic data.

The role offers an inspiring set of data challenges, building what will eventually be a comprehensive view of the health of up to five million people in the UK, including directly gathered information, genetics, NHS records and other linked data.

Our Future Health will be the UK’s largest ever health research programme, bringing people together to develop new ways to detect, prevent, and treat diseases. We are a charity, supported by the UK Government, in partnership with charities and industry. We work closely with the NHS and with public authorities across all nations and regions of the UK.

What you’ll be doing:

You’ll be part of a multidisciplinary team that’s creating pipelines that didn’t exist before, owning them in production and improving them over time. Your key responsibilities will include but not be limited to:

  • Supporting the build of data pipelines from data providers to our primary data store and trusted research environment.
  • Producing logic for data transformation steps as code that meets the requirements of our end users and builds well-curated, accessible and quality-controlled data for analysis.
  • Developing prototype pipelines for complex transformations, drawing on existing workflows developed in industry and academia.
  • Keeping abreast of best practice in data engineering across industry, research and Government and facilitating the adoption of standards.
  • Providing technical input into the upstream parts of the data pipeline, including the specification and transfer of data from data providers.
  • Carrying out ad hoc data curation activities that require hands-on development of bespoke ETL cleaning scripts in languages such as SQL and Python.
  • Working with researchers to understand the data requirements and helping them to deliver the data needed for their projects.

What you won’t be doing:

  • Working in a siloed environment with no freedom to make decisions.
  • Working in a place where you can’t see the impact your expertise makes.

Requirements

To succeed in this role, you will have some of the following skills:

  • The ability to communicate with and between technical and non-technical stakeholders, and to facilitate discussions within a multidisciplinary team, managing different perspectives.
  • A good knowledge and understanding of NHS data, such as hospital administrative data, disease registries or primary care data, and how it can be used to support research, OR a solid understanding and experience of bioinformatics, in particular the tools and methods associated with genomic data.
  • The ability to design, build and test pipelines based on feeds from multiple systems using a range of different technologies. You will understand how to create repeatable and reusable products.
  • Comfortable designing an appropriate metadata repository and presenting changes to existing metadata repositories. You understand a range of tools for storing and working with metadata.
  • Knowledge of health record coding systems and data standards (e.g., ICD, Read and SNOMED codes).
  • Proficient in a variety of data engineering programming languages and environments such as Python and SQL.
  • An understanding of how emerging trends in data tools, analysis techniques and data usage affect the organisation.
  • Understanding and working knowledge of information governance and data security approaches appropriate for sensitive health data.

Benefits

  • Up to £60,000 per annum basic salary.
  • Generous company pension package with employer contributions of up to 12%.
  • 30 days' annual leave (plus bank holidays).
  • Individual development budget.
  • Flexible and remote working arrangements and a lovely new office in Holborn, Central London.

Join us - let’s prevent disease together.
