Data Scientist/Senior Data Scientist - Defence & Security
London, United Kingdom
The Alan Turing Institute
Posted 8 months ago
The Alan Turing Institute is the UK’s national institute for data science and artificial intelligence. The Institute is named in honour of the scientist Alan Turing and its mission is to make great leaps in data science and artificial intelligence research in order to change the world for the better.
The Defence & Security programme at the Institute is forming a new team of data scientists in order to solve real-world problems aligned with securing the UK.
The team will collaborate with scholars across the Institute’s research community to enhance the applicability of research to particular problems. It will work with our partners from across UK Government to turn their data challenges into research questions. The team will create software and scripts that implement research and apply it to client data in a readable, reliable and reproducible fashion. It will present conclusions of research and analysis to the research community and clients through presentations, research papers, and interactive data visualisations. It will work with state-of-the-art high-performance computing and cloud platforms to realise collaborators’ data science and artificial intelligence research at scale.
The team will support the dissemination of research outputs through the publication and maintenance of open source research software packages. It will contribute to the sustainability of the open source ecosystem by adding features, fixing bugs, maintaining tools, and supporting community management in new and existing packages, where appropriate.
Duties and Responsibilities
Successful candidates will:
1. Apply state-of-the-art and novel data science and artificial intelligence techniques emerging from the Institute and elsewhere to problems faced by the Turing’s partners
- Understand the problems of partners and develop appropriate approaches to solving these problems.
- Understand which data are, or might be, available, and collect and manage these data.
- Perform analyses, which might include: building statistical models; applying machine learning techniques; building models and simulations; or applying optimisation techniques.
- Document processes for effective and efficient reuse across multiple domains.
2. Collaborate with research colleagues to develop and maintain software embodying research outputs
- Develop a good understanding of the relevant theory and the needs of potential users of the software.
- Be responsible for the programming effort, including design and planning.
- Test and validate the software to a high-quality standard.
3. Present, disseminate and explain our work
- Feed back the outcomes of analyses to clients and customers in the public, private, and third sectors in written form and in presentations.
- Share research in the practice of data science and artificial intelligence with the scholarly community through research papers and conferences.
- Publish, distribute, document and maintain research software packages.
4. Contribute to the life of the Institute and support its community
- Deliver teaching and training to colleagues and students, including within the team in our regular skills sessions.
- Support research colleagues to make the most of the Institute’s secure high-performance computing environments for advanced research.
5. In addition, for senior staff only:
- Line manage 1-3 other staff within the group, supporting their career development aspirations.
Candidates must be able to demonstrate, through examples, the capabilities below:
- A PhD or equivalent professional experience in a field with significant use of both computer programming and advanced statistical or numerical methods.
- Experience managing, structuring, and analysing research data.
- Experience managing and organising the parameters and results of computational experiments.
- Fluency in one or more modern programming languages used in research in data science and artificial intelligence. (We particularly work in R, Python, and modern C++, but demonstrable use of other programming languages for research, together with a facility for learning new languages, is most welcome.)
- An understanding of the importance of good practices for producing reliable software and reproducible analyses (e.g. version control, issue tracking, automated testing, package management, literate analysis tools such as Jupyter and R Markdown).
- Demonstrated enthusiasm and ability to rapidly assimilate new computational and mathematical ideas and techniques on the job, at a more than superficial level, and apply them successfully.
- Excellent written and verbal communication skills, including experience in the visual representation of quantitative data, documentation of software packages or data resources, the authoring of research papers or technical reports, and giving presentations or classes on technical subjects.
- Ability to lead one’s own work independently, including planning and execution, and to collaborate productively as part of a team.
In addition, for senior staff only:
- Experience mentoring and evaluating the work of others. (Formal line management experience is not essential, but applicants without it should be able to show significant evidence of informal mentorship.)
- Experience leading a project to a successful conclusion.
- Demonstrable experience managing conflict and resolving stakeholder tensions.
- EITHER Experience in making or evaluating the case for new projects (e.g. authoring or evaluating research proposals or business cases) OR Experience of managing, prioritising and resourcing a project portfolio.
We do not, of course, expect any candidate to have experience of everything listed below. We are a learning team, combining many techniques and approaches to address our projects. Successful candidates will be able to demonstrate existing knowledge of more than one of these areas, depending on experience level, and, importantly, a commitment to developing new expertise in others.
- Machine learning, including experience with one or more established software libraries.
- Computational statistics, particularly Bayesian modelling.
- Visualisation for understanding large, complex, or high-dimensional data.
- Knowledge management and ontology engineering, semantic web.
- Mathematical and computational modelling of complex systems.
- Logic, planning, verification, and automated reasoning.
- Programming language and API design. Domain specific languages.
- Exposure to mixed or qualitative research methods.
- User interface design and development with web technologies, especially for data visualisation and knowledge representation.
- Writing technical documentation.
- Advanced numerical simulation (e.g. FEM, CFD…)
- Experience with public cloud platforms.
- Experience working with confidential and sensitive data for research.
- Developing for high-performance computing hardware (CUDA, MPI, OpenMP).
- Experience contributing to, maintaining and/or leading open source research software projects.
- Experience building open source communities.
- Working with databases and APIs for the acquisition of parameter information for models.
- Experience working with legacy code, especially in traditional scientific programming languages (e.g. Fortran, MATLAB, C).
- Developing and/or delivering teaching and training in computational or mathematical methods for research.
- Developing and/or delivering teaching and training in applications of data science methods for non-programming experts.
- Automated testing, software quality assurance and continuous integration.
- Code review in a distributed team.