Python Engineer (NLP team) - Remote, with 4hrs overlap with UK/GMT timezone
Remote - London, England, United Kingdom
Applications have closed
Ohalo
Managing unstructured data just got easy with Ohalo's Data X-Ray. Quickly discover, classify and redact sensitive data in hybrid & multi-cloud environments.
Working at Ohalo
Ohalo was founded in 2017 by a group of data professionals on the premise that we would create order out of data chaos. Every day, global enterprises acquire more and more data, and understanding this data is critical to how these organisations operate. The product Ohalo has developed, the Data X-Ray, is a highly advanced tool that scans hundreds of thousands of words per second and classifies and redacts data at a scale not achievable by humans. Think saving millions on data monitoring, avoiding the risk of regulatory fines and being an asset to data privacy.
Location: Fully Remote, but with 4hrs (or more) overlap with GMT timezone
The Role
We’re looking for a Python Engineer with a specialization in Natural Language Processing. Ohalo’s current NLP application uses Python to annotate vast quantities of text in search of sensitive information for data privacy and data protection purposes. We’re interested in expanding our NLP capabilities around:
- Customisable Text Categorization
- Customisable Named Entity Recognition
To succeed in this role, you should possess outstanding skills in statistical analysis, machine learning methods and text representation techniques. Your ultimate goal is to develop efficient NLP applications and analytical pipelines at an enterprise level that can process petabytes of documents in parallel.
You’ll be working closely with the CTO and the Product Manager, as well as the wider engineering team, to deliver high-value and high-demand NLP features to enterprise clients.
Responsibilities
- Design NLP features, such as customisable Text Categorization and Named Entity Recognition, for our clients
- Work with the Product Manager and the CTO on defining new features that help clients understand the precision and recall of their datasets
- Maintain and improve the efficiency and stability of the existing data pipeline, including Dictionary and Regular Expression matching
- Extend ML libraries and frameworks to apply in NLP tasks
- Build a testing framework and run evaluation experiments to ensure the NLP application performs at acceptable throughput levels
- Perform statistical analysis of results and refine our own in-house models for benchmarking
- Stay current with the rapidly changing field of NLP & machine learning
Requirements
- Experience developing robust Python applications (REST APIs, gunicorn, pysql)
- Proven experience with NLP frameworks (e.g. spaCy)
- Understanding of NLP techniques for text representation, tokenization, semantic extraction, sentiment analysis, data structures and modeling (such as n-grams, bag of words)
- Ability to write robust and testable code
- Ability to effectively design software architecture
- Strong communication skills
- An analytical mind with a scientific approach to problem-solving
- Bonus if you have experience with machine learning frameworks (like Keras or PyTorch) and libraries (like scikit-learn)
Benefits
💰 Competitive Pay
📈 Meaningful Equity as a fast-growing Series A company
💻 New workstation & home office stipend
🏝 25 Days Paid Vacation (unlimited unpaid)
🏡 Flexible working: remote first, with office space in London/UK if convenient or desired