Senior Research Engineer (Python)

Prague

Applications have closed

Rossum

Automate complex business workflows with Rossum’s AI document processing solution. Reduce manual tasks, increase accuracy, drive efficiency.

We use state-of-the-art AI plus an intuitive UI to eliminate useless paperwork and make the whole world go faster.

We are already processing up to hundreds of documents per minute. This volume is growing exponentially, and the scope of our product is expanding only slightly more slowly.

We need our Machine Learning models to keep getting better, and that depends heavily on the data we train them on. We are looking for a DataOps Engineer to help us curate our datasets, keep our data pipelines clean, and build an ecosystem where data is accessible to anyone who needs it, be it the research team, the annotation team, or analytics. The more and better data we have, the faster we can teach computers to understand documents, and the closer we get to eliminating that useless paperwork.

YOUR MISSION

  • You will design, build, and maintain infrastructure for collecting, filtering, transforming, and delivering data to our Data Annotation and Research teams.
  • You will be the creator and caretaker of our Research team’s data pipelines, ensuring a continuous flow of data for analytics and/or training of ML models.
  • You will cooperate closely with our Research team to align on strategy and on the best way to streamline data acquisition and labeling, so that our datasets can grow without limit.
  • You will make sure our datastores are and remain compliant with various data retention policies (customer SLAs, GDPR, etc.).
  • You will establish a foothold for the future evolution of the DataOps team as one of its first hires. 

ABOUT YOU

We are all geeks and hackers who like to engineer beautiful systems, all the way to Rossum’s CEO. You will fit right in if:

  • You are a Python Engineer who loves data.
  • You have working knowledge of SQL and/or NoSQL databases and have a good idea of how to make them work at scale.
  • You have experience with managing or building ETL pipelines in any shape or form.
  • You have experience with workflow management frameworks (e.g. Argo, Airflow, Kubeflow).
  • You have experience with K8s, cloud platforms, and event-driven systems.
  • You understand the importance of data in data-driven and machine learning-based systems.
  • You love to solve large-scale problems following the 3Es: easy to maintain, easy to extend, easy to scale.
  • You are used to taking end-to-end responsibility for features – from discovery and design to delivery and deployment.
  • You are honest and bullshit-free. You base your opinions on data, but don’t cling to it in the face of good arguments.

OUR TECH STACK

Rossum is a Python company (not uncommon for an AI startup). We use Python for training our AI models, actual data processing, and also for our backend APIs.

  • Our backend services are written in Django or Flask.
  • We are gradually splitting loosely coupled microservices off from our core components, picking the cases where doing so improves scaling and reliability.
  • Having been heavy REST API users, we are moving communication between our internal components towards message queues. We prefer to connect our services through RabbitMQ or Apache Pulsar.
  • We use PostgreSQL for our primary databases, partitioned to ensure queries are fast enough on our table sizes.
  • Our ML models are built with Keras and TensorFlow, and we like to rely on Kubeflow for most training and experiments.
  • All our services are deployed in Kubernetes clusters: currently in AWS and Azure.
  • Our deployments and releases are 98% based on GitOps, with infrastructure defined as code and managed by GitLab-based CI/CD pipelines.

WHAT WE OFFER (BENEFITS)

We are building a hyper-growth SaaS startup following the best Silicon Valley practices, in Prague.

  • Stock options
  • 5 weeks of vacation
  • 5 sick days / personal time off
  • Flexible working hours, hybrid working arrangement
  • Two extra weeks of paternity leave
  • High-end laptop & other necessary tech
  • English & Czech language lessons at all levels
  • Tasty snacks, food, and beverages in the office
  • Multisport card for access to sports facilities
  • Referral program


About Us

Rossum (the name comes from Czech writer Karel Čapek’s play “Rossum’s Universal Robots”) is capable of extracting data from documents six times faster than a human. Last year alone it saved companies across a number of sectors over one billion keystrokes, the equivalent of 150 years of human labor. Today, the company automates document communication for customers on five continents, with a client roster that includes Siemens, Bosch, Cushman & Wakefield, Veolia, and, here in the Czech Republic, Alza, Kofola, and Mattoni.

After tripling our revenue in 2020 and securing Eastern Europe’s largest-ever Series A funding of $100M in 2021, we plan to further expand our market share and invest heavily across our go-to-market teams and our research and development backbone.

Our product is number one in its category.

Learn more about Rossum on Expats.cz, Forbes, and TechCrunch.

 

Rossum is an equal opportunity employer. At Rossum we believe human potential is the most powerful force for the progress and success we aim for. We therefore maintain a culture of belonging, treat people with respect, and provide equal opportunities in hiring, employment, promotion, termination, compensation, etc. Rossum does not discriminate against any job applicant, employee, or protected veteran because of race, colour, religion, national origin, sex (including pregnancy, gender identity and sexual orientation), physical or mental disability, age or genetic information.

By submitting your application you acknowledge that Rossum will process your personal data for the purposes of recruitment and the selection procedure. Rossum acts as a joint controller together with its affiliates (Rossum Ltd, Rossum Czech Republic s.r.o., Rossum USA Inc. and Rossum Israel Ltd.). More details on Rossum’s privacy policy can be found here.

Tags: Airflow APIs AWS Azure Data pipelines Django ETL Flask GitLab Keras Kubernetes Machine Learning Microservices ML models Pipelines PostgreSQL Pulsar Python R Research REST API SQL TensorFlow

Perks/benefits: Career development Equity Flex hours Flex vacation Snacks / Drinks Startup environment

Region: Europe