Linguistic Data Manager

Gothenburg, Sweden or Remote

Recorded Future, Inc.

Recorded Future is the most comprehensive and independent threat intelligence platform. Identify and mitigate threats across cyber, supply-chain, physical and fraud domains.

With 900 employees, over $200M in sales, 1,400+ clients, and rapid year-over-year growth, Recorded Future is the world’s most advanced, and largest, intelligence company!

We are looking for an outstanding linguistic data manager to help us leverage text and NLP modeling techniques to find threat patterns and signals in vast amounts of text. This position requires an organized person with extensive experience in creating high-quality datasets for training text-based machine learning models, and solid knowledge of linguistic and statistical methods for aligning datasets to models and use cases. You should have programming experience in Python and feel comfortable in a technical environment.

This role is part of the Analytics team in the R&D organization. You will work in a highly effective and collaborative team dedicated to building an NLP system that understands threat signals in text as well as a human analyst does.

You will get the chance to develop great software in a fast-moving environment close to our global clients and their needs. Together, we’ll improve our product to meet increasing demands while focusing on scalability and quality. You will also get a chance to experiment with new technology and explore which solutions are most suitable for solving real-world problems. You will join a dynamic team whose members are eager to take on new challenges and passionate about what they do.

Your responsibilities will include:

  • Machine Learning:
    • Ownership of our text-based machine learning models
    • Developing machine learning models
    • Understanding the impact of models in our product
  • Manage Sources & Data:
    • Creating training datasets of the highest quality
    • Understanding the lifecycle of our data sources
    • Using linguistic and statistical tools for error analysis and dataset consistency
    • Planning and coordinating the creation of tools for acquiring new training data
    • Ownership of text training datasets and the related documentation
  • Lead Annotation Process:
    • Annotation guidelines
    • Adjudication
    • Continuous Monitoring
    • Vendor Management
  • Serve as Linguistic Expert:
    • Defining new linguistic resources needed to expand Recorded Future’s NLP products
    • Locating existing open-source or proprietary linguistic resources (datasets & text analysis tools)
    • Defining new event and entity categories together with product management
    • Clarifying linguistic requirements and dependencies for other teams
    • Performing quality assessment of our current linguistic performance and resources

Requirements: 

  • Bachelor's degree in computational linguistics, machine learning, computer science, or a similar discipline.
  • Experience in the data lifecycle process for NLP datasets.
  • Comfortable working in Python/Java/Scala.
  • Experience with tooling for data handling and data analysis.
  • Experience with spaCy, PyTorch, Git, Elasticsearch, and MongoDB is a plus.

Diversity has been essential to Recorded Future since day one, and that is clearly visible in all teams in the company, yet there’s always room for improvement. In essence, we welcome applications that will help us improve this even more.

Why should you join Recorded Future?
Recorded Future employees (or “Futurists”) represent over 40 nationalities and embody our core values of having high standards, practicing inclusion, and acting ethically. Our dedication to empowering clients with intelligence to disrupt adversaries has earned us a 4.7-star user rating from Gartner and 8 of the top 10 Fortune 100 companies as clients.

Want more info? 
Blog & Podcast: Learn everything you want to know (and maybe some things you’d rather not know) about the world of cyber threat intelligence
Instagram & Twitter: What’s happening at Recorded Future
The Record: The Record is a cybersecurity news publication that explores the untold stories in this rapidly changing field
Timeline: History of Recorded Future
Recognition: Check out our awards and announcements

We are committed to maintaining an environment that attracts and retains talent from a diverse range of experiences, backgrounds and lifestyles.  By ensuring all feel included and respected for being unique and bringing their whole selves to work, Recorded Future is made a better place every day.

If you need any accommodation or special assistance to navigate our website or to complete your application, please send an e-mail with your request to our recruiting team at careers@recordedfuture.com 

Recorded Future is an equal opportunity and affirmative action employer and we encourage candidates from all backgrounds to apply. Recorded Future does not discriminate based on race, religion, color, national origin, gender including pregnancy, sexual orientation, gender identity, age, marital status, veteran status, disability or any other characteristic protected by law.

Recorded Future will not discharge, discipline or in any other manner discriminate against any employee or applicant for employment because such employee or applicant has inquired about, discussed, or disclosed the compensation of the employee or applicant or another employee or applicant.

Tags: Computer Science, Data analysis, Elasticsearch, Git, Machine Learning, ML models, MongoDB, NLP, Open Source, Python, PyTorch, R, R&D, Scala, spaCy, Statistics

Perks/benefits: Career development, Startup environment, Team events

Regions: Remote/Anywhere, Europe
Country: Sweden
Category: Leadership Jobs