Software engineer intern - NLP Rules Extraction Plugin for DSS

Paris, France

Dataiku

Posted 1 month ago

Dataiku allows enterprises to create value with their data in a human-centered way while breaking down silos and encouraging collaboration. One of the most distinctive characteristics of our product, Data Science Studio (DSS), is the breadth of its scope and the fact that it caters to both technical and non-technical users. Through DSS, we aim to empower people through data and democratize data science.
If...

  • You believe Machine Learning & Big Data technologies are the future.
  • You can't help looking under the hood.
  • You want to work for a startup… but not for one that will disappear before your internship ends.
  • You want to improve your FIFA skills (PS4 edition).
  • You want to learn how to code on a real product codebase with talented developers.

Look no further! You found your next internship.
Today, Natural Language Processing (NLP) techniques are used everywhere, from real-time social media monitoring to automatic classification of customer complaints. Dataiku is improving its NLP capabilities to cover the whole text analysis pipeline, and business rules in particular.
The objective of this 4-to-6-month internship is to write a plugin recipe that lets business users define manual rules to tag and extract pieces of information from text documents. The rules may take the form of dictionaries, ontologies and formulas (DSL).
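
To give a rough idea of the simplest kind of rule mentioned above, here is a minimal sketch of dictionary-based tagging in plain Python. It is an illustration only, not the plugin's design: the `RULES` table, the tag names, and the `tag_text` helper are all hypothetical, and a real implementation would more likely build on spaCy or UIMA Ruta than on regular expressions.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical rule set (an assumption for illustration): tag name -> dictionary of terms.
RULES: Dict[str, List[str]] = {
    "COMPLAINT": ["refund", "broken", "late delivery"],
    "PRODUCT": ["laptop", "phone"],
}


def tag_text(text: str, rules: Dict[str, List[str]]) -> List[Tuple[str, str, int, int]]:
    """Return (tag, matched_text, start, end) for every dictionary hit, sorted by position."""
    hits = []
    for tag, terms in rules.items():
        if not terms:
            continue  # an empty dictionary can never match anything
        # Try longer terms first so multi-word entries win over shorter overlapping ones.
        alternatives = "|".join(
            re.escape(term) for term in sorted(terms, key=len, reverse=True)
        )
        pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((tag, match.group(0), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


hits = tag_text("My laptop arrived broken, I want a refund.", RULES)
for tag, matched, start, end in hits:
    print(f"{tag:<10} {matched!r} [{start}:{end}]")
```

Ontologies and formula rules would need a richer representation than a flat term list, which is part of what the internship is meant to explore.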

After a period of onboarding, you will:

  • Get familiar with DSS plugin development
  • Study AST, spaCy + Rita DSL, Apache UIMA + RUTA DSL features & limitations
  • Develop the plugin
  • Celebrate and party because you’ve simplified the life of Data Scientists and Data Analysts everywhere.

We’re looking for really talented, smart, kind, and genuinely curious individuals to work alongside us. To succeed, you should:

  • Be autonomous, to drive your subject.
  • Be eager to learn new things.
  • Have a good knowledge of a programming language (Java, JavaScript, Python, R, Ruby…).
  • Have a basic knowledge of Web development.
As an intern, you'll join the R&D team of a well-known startup. The team is composed of around 50 engineers passionate about software development. The atmosphere is both serious and fun (free lunches every Friday in the kitchen, large screen to play FIFA, …).
Miscellaneous but important details:
The internship will take place in our nice and cozy office located in the heart of Paris. The monthly remuneration will be:

  • 1000 euros for students in their 4th year of university / engineering school.
  • 1400 euros for students in their 5th year of university / engineering school.

Remuneration additionally includes a 50% reimbursement of your public transit pass.
To fulfill its mission, Dataiku is growing fast! In 2019, we achieved unicorn status, went from 200 to 400 people and opened new offices across the globe. Spanning from Sydney to Frankfurt, Denver to London, geography doesn’t stop Dataikers from working closely together and sharing experiences. Collaboration is key within our product and culture. We strive to create a sense of belonging and community while fostering diverse thinking by encouraging cross-team, cross-office interactions like our annual company offsite or Paris onboarding. Fly over to Twitter, LinkedIn, and Instagram to read stories about our culture, people, and success. 
Our practices are rooted in the idea that everyone should be treated with dignity, decency and fairness. Dataiku also believes that a diverse identity is a source of strength and allows us to optimize across the many dimensions that are needed for our success. Therefore, we are proud to be an equal opportunity employer. All employment practices are based on business needs, without regard to race, ethnicity, gender identity or expression, sexual orientation, religion, age, neurodiversity, disability status, citizenship, veteran status or any other aspect which makes an individual unique or protected by laws and regulations in the locations where we operate. This applies to all policies and procedures related to recruitment and hiring, compensation, benefits, performance, promotion and termination and all other conditions and terms of employment.
Job tags: Big Data Engineering Java JavaScript Machine Learning NLP Python R Ruby spaCy