(Lead) NLP Data Scientist / ML Engineer, RegBrain
Remote, UK
Applications have closed
CUBE
We're a global regtech engineering a movement to transform regulatory data into regulatory intelligence.
AI at CUBE
CUBE uses AI and NLP to machine-read the regulatory internet at global scale. We collect, clean, standardise, translate, monitor, classify, and enrich regulatory data across 180 countries in over 60 languages. All in near real time.
We've even built our own ontology of regulation, machine-driven and continuously refined by a team of subject matter experts.
At a high level, CUBE uses AI to transform regulatory data into regulatory intelligence. And this is exactly where RegBrain comes in.
RegBrain
It's always a great time to become a CUBER, but there literally could not be a better time than now. This year, we are building out the core RegBrain team. RegBrain leverages the 10 years of global regulatory data that our existing AI teams have collected, cleaned, standardised, translated, and classified.
The mission: to create the ultimate semantic map of global regulatory data, and to take CUBE's AI to the next level through data learning.
The RegBrain team will be responsible for the end-to-end research, design, and development of both the semantic map and a suite of AI-driven capabilities, including recommendation systems, prediction, and task automation.
As such, the team will be split into two core areas: research & data science and ML & data engineering. All with an NLP flavour, of course.
Please note: While we're hiring across a wide range of experience levels over the next 4-6 months, the most immediate open roles are team lead positions (there will be one lead for each subteam). The leads will directly influence the hiring process for the rest of the team. If you are not interested in a lead role but think you'd be a great fit for RegBrain, you can still fill out the application. It's designed to be versatile.
Here are the core responsibilities of each RegBrain subteam. Note that the responsibilities are extremely complementary, to reflect how closely the subteams will work together.
Research & data science
Core mission: Design ML & NLP prototypes for each RegBrain use case, and own the semantic map of CUBE's regulatory data.
- Prepare, maintain, and refine the semantic map (knowledge graph) of CUBE's regulatory data.
- Develop, test, and improve optimal ML & NLP models for each RegBrain use case.
- Present information using data visualisation techniques (especially important for the semantic map).
- Determine additional data sources and how to include them in the pipeline (another team will help with actually adding them).
- Stay up to date with ML & NLP research, and experiment with new models and techniques.
ML & data engineering
Core mission: Develop the ML & NLP prototypes from the data science team into APIs that can be consumed by CUBE's core platform.
- Determine the cloud architecture strategy and overall ML & data systems for RegBrain.
- Work closely with other AI engineering and data teams to ingest data from our core platform, our transformation engine, and other sources.
- Improve the efficiency, performance, and scalability of ML & NLP models (this includes data quality, ingestion, loading, cleaning, and processing).
- Improve the efficiency, performance, and scalability of the semantic map.
- Verify that the quality of results in production meets the requirements.
Core competencies
Just as the responsibilities of the RegBrain subteams overlap, the core competencies we're looking for overlap too. The good news for you is that we will use your preferences and the interview process to collaboratively determine which side of the spectrum you should sit on. The strongest candidates have competencies across both sides (and are as modular as CUBE's core product!).
- End-to-end ML model design and development experience (design is more relevant for the data science team; deploying models to production and performance monitoring are especially important for the engineering team)
- Experience with cloud infrastructure for data pipelining and model deployment (more relevant for engineering)
- Experience with ML platforms, frameworks, and libraries
- Experience analysing vast volumes of textual data
- Strong familiarity with SQL and NoSQL/graph databases
- Solid understanding of data structures, data modelling, and software architecture
- Ability to write clear, robust, and testable code, especially in Python
- Strong grasp of data visualisation techniques (for dashboarding, reporting, etc.)
- A systems thinking approach
- A mathematically and statistically-oriented brain
- A healthy sense of humour (you're going to need it... don't say we didn't warn you)
Experience matters. But what is more important than the raw number of years of experience is demonstrated proficiency (through GitHub profiles/online portfolios and the interview process itself). Bonus points for Stack Overflow and Kaggle contributions!
Why you'll love RegBrain (& CUBE)
If there is a best time to join RegBrain, it's now. Here are the many reasons why.
Immediate global impact. CUBE is a well-established player in regtech (we were around before regtech was even a thing!), and our category-defining product is used by leading financial institutions around the world (including Revolut, Citi, and HSBC). We have an audience across 150 countries, and they love CUBE.
Freedom & flexibility. Think of RegBrain as a fully-funded startup within a scaleup. The first to join will have a blank canvas, a tabula rasa. You'll be able to choose your own tech stack. GCP or AWS or Azure? To Spark or not to Spark? PyTorch or TensorFlow? You decide. As long as you can justify your choices, the rings of Saturn are the limit.
Quantity & quality of data. The stage has literally been set: over the past 10 years, the five engineering teams at CUBE have built solid foundations for data collection, transformation, and classification. The RegBrain team will focus solely on learning from this mountain of structure.
A rich & complex dataset. The main dataset is not only already structured, but also longitudinal and multilingual. We've tracked changes to regulation over time and built in-house translation models for 60+ languages.
Always learning. Part of your job is to stay up-to-date with the latest research, and share your learning with the RegBrain team and other AI teams at CUBE. You'll have a training budget and a conference budget. In the mid-long term, we're aiming to collaborate with universities.
Responsible AI. We will proactively address the inevitable biases that emerge for any AI system. Our Head of Product was trained at the Oxford Internet Institute and has direct connections with ethicists who are influencing the future of AI regulation.
Employee-first work-life policy. CUBE went fully remote before the pandemic even hit, because we wanted to define the future of work. As a CUBER, you'll be able to design your home office and choose your own work equipment. Unable to work from home one week, or desperate for in-person interaction with colleagues? No problem: book a room in a coworking space.
Sustainable, customer-driven growth. We are a bootstrapped company funded by customers and strategic private investment. This means that growth is sustainable, and product development is very closely aligned with customer needs.
Visa sponsorship if required. We know every single nuance of Skilled Worker visas.
Extremely bespoke hiring process. At CUBE, we're trying to flip hiring on its head: the objective of the process is to create a personalised job description (and title). This page sets the general context. We'll collaboratively determine the best role for you, given your interests, CUBE's needs, and other members of the team.
Hiring timeline
We know how insufferably long and complicated hiring processes can be. We've been there before.
That's why at CUBE, we aim to compress the hiring timeline to between 5 and 10 days (from the first-round interview to the final round). There's no HR screen, culture fit interview, or coding on a whiteboard. Just high-quality infoflow in both directions.
Here's what will happen:
- Online application (link below)
- First-round video interview with RegBrain's Head of Product (30-45m)
- Second-round video interview with our CTO (30-45m)
- Take-home challenge (it'll be fun, we promise, and we won't ask for more than a few hours of your time)
- Final-round panel interview, again over video (45-60m)
If you have any questions at this stage, feel free to use the live chat widget on the application page. Otherwise: what are you waiting for? This is your once-in-a-lifetime opportunity to define the future of regulation. The clock is already ticking.
Tags: APIs AWS Azure Classification Engineering GCP GitHub Machine Learning Model deployment Model design NLP NoSQL Python PyTorch Research Spark SQL TensorFlow
Perks/benefits: Career development Flex vacation Home office stipend Salary bonus Startup environment