Senior Research Engineer (Python)
Prague
Rossum
Automate complex business workflows with Rossum’s AI document processing solution. Reduce manual tasks, increase accuracy, drive efficiency. We use state-of-the-art AI plus an intuitive UI to eliminate useless paperwork and make the whole world go faster.
We are already processing up to hundreds of documents per minute. This volume is growing exponentially, and the scope of our product is growing only slightly slower.
We need our Machine Learning models to constantly get better, and that depends largely on the data we train them on. We are looking for a DataOps Engineer to help us curate our datasets, keep our data pipelines clean, and build an ecosystem where data are accessible to anyone who needs them, be it the research team, the annotation team, or analytics. The more and better data we have, the faster we can teach computers to understand documents, and the closer we get to eliminating that useless paperwork.
YOUR MISSION
- You will design, build, and maintain infrastructure for collecting, filtering, transforming, and delivering data to our Data Annotation and Research teams.
- You will be the creator and caretaker of our Research team’s data pipelines, ensuring a continuous flow of data for analytics and/or training of ML models.
- You will cooperate closely with our Research team to align on strategy and on the best way to streamline the data acquisition and labeling process, so that our datasets can grow without limit.
- You will make sure our datastores are and remain compliant with various data retention policies (customer SLAs, GDPR, etc.).
- You will establish a foothold for the future evolution of the DataOps team as one of its first hires.
ABOUT YOU
We are all geeks and hackers who like to engineer beautiful systems, all the way to Rossum’s CEO. You will fit right in if:
- You are a Python Engineer who loves data.
- You have working knowledge of SQL and/or NoSQL databases and have a good idea of how to make them work at scale.
- You have experience with managing or building ETL pipelines in any shape or form.
- You have experience with workflow management frameworks (e.g. Argo, Airflow, Kubeflow).
- You have experience with K8s, cloud platforms, and event-driven systems.
- You understand the importance of data in data-driven and machine learning-based systems.
- You love to solve large-scale problems guided by the 3Es: easy to maintain, easy to extend, easy to scale.
- You are used to taking end-to-end responsibility for features – from discovery and design to delivery and deployment.
- You are honest and bullshit-free. You base your opinions on data, but don’t cling to it in the face of good arguments.
OUR TECH STACK
Rossum is a Python company (not uncommon for an AI startup). We use Python for training our AI models, actual data processing, and also for our backend APIs.
- Our backend services are written in Django or Flask.
- We are gradually chipping away loosely coupled microservices from our core components, picking the cases where it improves scaling and reliability.
- Having been heavy REST API users, we are moving communication between our internal components towards message queues. We prefer to connect our services with RabbitMQ or Apache Pulsar.
- We use PostgreSQL for our primary databases, partitioned to ensure queries are fast enough on our table sizes.
- Our ML models are built with Keras and TensorFlow, and we like to rely on Kubeflow for most training and experiments.
- All our services are deployed in Kubernetes clusters, currently in AWS and Azure.
- Our deployments and releases are 98% based on GitOps, with infrastructure defined as code and managed by GitLab-based CI/CD pipelines.
WHAT WE OFFER (BENEFITS)
We are building a hyper-growth SaaS startup following the best Silicon Valley practices, in Prague.
Stock options
5 weeks of vacation
5 sick days / personal time off
Flexible working hours, hybrid regime of work
Two extra weeks of paternity leave
High-end laptop & other necessary tech
English & Czech language lessons on all levels
Tasty snacks, food and beverages in the office
Multisport card to access sports facilities
Referral program
ABOUT US
Rossum (the name comes from Czech writer Karel Čapek’s play “Rossum’s Universal Robots”) is capable of extracting data (from documents) six times faster than the human rate. Last year alone it managed to save companies across a number of sectors over one billion keystrokes, the equivalent of 150 years of human labor. Today, the company automates document communication for customers on five continents and a client roster that includes Siemens, Bosch, Cushman & Wakefield, Veolia, and, here in the Czech Republic, Alza, Kofola, and Mattoni.
After tripling our revenue in 2020 and securing Eastern Europe’s largest-ever Series A funding of $100M in 2021, we plan to further expand our market share and invest heavily in our go-to-market teams and our research and development backbone.
Our product is number one in its category.
Learn more about Rossum on Expats.cz, Forbes & TechCrunch.
Rossum is an equal opportunity employer. At Rossum, we believe human potential is the most powerful force for the progress and success we aim for. We therefore maintain a culture of belonging, treat people with respect, and provide equal opportunities in hiring, employment, promotion, termination, compensation, etc. Rossum does not discriminate against any job applicant, employee, or protected veteran because of race, colour, religion, national origin, sex (including pregnancy, gender identity and sexual orientation), physical or mental disability, age or genetic information.
By submitting your application, you acknowledge that Rossum will process your personal data for the purposes of recruitment and the selection procedure. Rossum acts as joint controller together with its affiliates (Rossum Ltd, Rossum Czech Republic s.r.o., Rossum USA Inc. and Rossum Israel Ltd.). More details on Rossum’s privacy policy can be found here.
Tags: Airflow APIs AWS Azure Data pipelines Django ETL Flask GitLab Keras Kubernetes Machine Learning Microservices ML models Pipelines PostgreSQL Pulsar Python R Research REST API SQL TensorFlow
Perks/benefits: Career development Equity Flex hours Flex vacation Snacks / Drinks Startup environment