Data Scientist, Research Data Insights
USA, East Coast (Home based)
Full Time Clearance required USD 110K - 185K *
Digital Science
Department: Technical
Employment Type: Full Time
Description
About us
We are Digital Science and we are advancing the research ecosystem. We are a pioneering technology company, and our vision is of a future where a trusted and collaborative research ecosystem drives progress for all. We believe in better, open, collaborative and inclusive research. In creating the next generation of tools and working in partnership with the community we tackle some of the biggest challenges to research. In order to achieve our vision, we need innovative, inspiring and dynamic people to join our team. Want to join us?
Dimensions, part of the Digital Science family, is the world’s largest linked research information dataset, covering millions of research publications and connected by more than 1.3 billion citations. We are shaping the future of research and are looking for a Data Scientist to join the team.
Your new role
As part of a dynamic team environment you will support our global customers through the development of new analytic approaches and capabilities leveraging our scientometric data sets and emerging knowledge graph ecosystem. You will help our customers, including the largest funding and research organizations in the U.S. Federal government and beyond, to more effectively manage their multi-billion-dollar research portfolios by providing delivery excellence that delights our customers, fuels word-of-mouth growth, and drives very high renewal rates. You will leverage our data and platforms, including Dimensions and the rest of Digital Science's portfolio, to support research assessment, portfolio management/analysis, strategic planning and more.
The role will touch all aspects of data analysis & delivery, from expanding and leveraging the Dimensions Knowledge Graph, to managing specialised analytic infrastructure resources in secure environments in support of specialised data indexing and analytic workloads, to data collection/wrangling, visualization, and the development & delivery of interactive dashboards and other applications. You will work closely with team members with a diversity of intellectual and professional backgrounds to harness our unique data and product capabilities to address our customers' critical needs.
What you’ll be doing
- Conduct large-scale, quantitative data analysis (millions of records), potentially including custom indexing, data linking, data collection and other data wrangling, using Dimensions' in-house data assets and external or customer data sets as required.
- Leverage Large Language Models and other AI technologies to address customer analytical needs, identifying opportunities to incorporate these tools into analytic workflows and customer-facing applications.
- Plan, design, maintain and document data integrations, pipelines, internal-use utilities, tools and software packages to support our advanced analytic capabilities.
- Build machine-learning models that operate on large collections of text-based documents (tens to hundreds of millions of documents), for a variety of applications including named entity resolution, relationship extraction, document clustering and topic modeling.
- Create and deploy visualizations and interactive web-based dashboards, using tools such as Plotly, Dash, and React.
What you’ll bring to the role
- You will have a good understanding of the S&T ecosystem - funders, research organizations, scientific publishing and related experience working with bibliometric/scientometric datasets such as scientific publications, grants, and patents.
- You will have familiarity with knowledge graphs (including technologies such as RDF and SPARQL). Ideally, you will have experience building and querying knowledge graphs in support of analytic workloads leveraging bibliometric/scientometric data sets.
- You will have experience in Python, including relevant Python libraries and modules such as pandas, scikit-learn, gensim, transformers, PyTorch and Dash.
- You will have familiarity with commercial AI models like GPT, Bard or PaLM and ideally experience working with LLM support toolkits such as LangChain, Guidance, and Haystack.
- You’ll be experienced in Natural Language Processing and machine learning methods with bibliometric/scientometric datasets.
- You will have experience with data visualization tools (Plotly, D3, matplotlib, etc.).
- You will thrive in an environment where you can work independently and remotely.
- You will have previous experience of working globally and across multiple teams.
- You will be a strong communicator and able to communicate your findings to a varied audience through written and verbal presentation.
- You will have 3-5 years of experience delivering customer solutions.
Additional Information
Current US Public Trust clearance preferred, as applicants will be subject to a security investigation and will need to meet eligibility requirements for access to sensitive information.
Living our Values
We invest in, nurture and support innovative businesses and technologies that make all parts of the research process more open, efficient and effective. The talent we secure is fundamental to us achieving our vision and our growth plans. The values we live by are:
- We are brave in the pursuit of better
- We are collaborative and inclusive
- We are always open-minded
- We are from and for the community
We're an equal opportunity employer. All applicants will be considered for employment without attention to race, colour, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.
* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰