Python Engineer (NLP team) - Remote, with 4hrs overlap with UK/GMT timezone

Remote - London, England, United Kingdom

Applications have closed

Ohalo

Managing unstructured data just got easy with Ohalo's Data X-Ray. Quickly discover, classify and redact sensitive data in hybrid and multi-cloud environments.

Working at Ohalo

Ohalo was founded in 2017 by a group of data professionals on the basis that we would create order out of data chaos. Every day, global enterprises acquire more and more data, and understanding this data is critical to how these organisations operate. The product Ohalo has developed, the Data X-Ray, is a highly advanced tool that scans hundreds of thousands of words per second and can classify and redact data at a scale not achievable by humans. Think saving millions on data monitoring, avoiding regulatory fine risks and being an asset to data privacy.


Location: Fully Remote, but with 4hrs (or more) overlap with GMT timezone


The Role

We’re looking for a Python Engineer with a specialization in Natural Language Processing. Ohalo's current NLP application uses Python to annotate vast quantities of text in search of sensitive information for data privacy and data protection purposes. We’re interested in expanding our NLP capabilities around:

  • Customisable Text Categorization
  • Customisable Named Entity Recognition

To succeed in this role, you should possess outstanding skills in statistical analysis, machine learning methods and text representation techniques. Your ultimate goal is to develop efficient NLP applications and analytical pipelines at an enterprise level that can process petabytes of documents in parallel.

You’ll be working closely with the CTO and the Product Manager, as well as the wider engineering team, to deliver high-value and high-demand NLP features to enterprise clients.


Responsibilities

  • Design NLP features, such as customisable Text Categorization and Named Entity Recognition, for our clients
  • Work with the product manager and the CTO on defining new features that help clients understand the precision and recall achieved on their datasets
  • Maintain and improve the efficiency and stability of the existing data pipeline, including Dictionary and Regular Expression matching
  • Extend ML libraries and frameworks to apply in NLP tasks
  • Build a testing framework and run evaluation experiments to ensure the NLP application performs at acceptable throughput levels
  • Perform statistical analysis of results and refine our own in-house models for benchmarking
  • Stay current with the rapidly changing field of NLP & machine learning

Requirements

  • Experience developing robust Python applications (REST APIs, gunicorn, pysql)
  • Proven experience with NLP frameworks (e.g. spaCy)
  • Understanding of NLP techniques for text representation, tokenization, semantic extraction, sentiment analysis, data structures and modeling (such as n-grams, bag of words)
  • Ability to write robust and testable code
  • Ability to effectively design software architecture
  • Strong communication skills
  • An analytical mind with a scientific approach to problem-solving
  • Bonus if you have experience with machine learning frameworks (like Keras or PyTorch) and libraries (like scikit-learn)

Benefits

💰 Competitive Pay

📈 Meaningful Equity as a fast-growing Series A company

💻 New workstation & home office stipend

🏝 25 Days Paid Vacation (unlimited unpaid)

🏡 Flexible working: remote first, with office space in London/UK if convenient or desired

Tags: APIs Engineering Keras Machine Learning NLP Pipelines Python PyTorch Scikit-learn spaCy Testing

Perks/benefits: Career development Competitive pay Equity Flex hours Flex vacation Home office stipend Salary bonus Unlimited paid time off

Regions: Remote/Anywhere Europe
Country: United Kingdom