Data Engineer - Machine Learning team

Zagreb, City of Zagreb, Croatia

ReversingLabs

Software Supply Chain Security, Threat Intelligence, and Threat Analysis Solutions to

ReversingLabs application security and threat intelligence solutions have become essential to advancing cybersecurity around the globe. Funded by our recent Series B investment, we're on a journey to expand and grow by hiring top talent across the security industry.

Every application threatens businesses with new supply chain risks. ReversingLabs is the only company that can dissect any binary at unprecedented speed, scale, and explainability to protect the enterprise end-to-end.

Our mission is to help IT professionals secure their code from supply chain threats. If this sounds interesting and you wondered what the Log4j fuss in the news was all about, you can become a part of our solution that secures the software release process for publishers and manages third-party risk for software buyers. We are seeking extraordinary talent to help forge this transformational journey at ReversingLabs. Your future role as a Data Engineer is extremely important to the success of our solution, the secure.software platform for software assurance. This is a game-changing opportunity.

The machine learning team is part of the static file analysis group at ReversingLabs. Its projects include data-driven approaches to file type identification, malware classification, detection of hidden payloads, and extraction of security-related file properties. These projects complement the more conventional static analysis techniques developed by threat analysts and reverse engineers in the group, and aim to be practical for threats present in real-world data distributions. For data, we leverage the existing ReversingLabs file reputation collections and novel samples received from security industry feeds, threat intelligence, and in-house harvesting. Beyond straightforward product-oriented projects, we also engage in more research-oriented endeavors aimed at threat hunting, threat intelligence, and data quality in our file metadata collections. The biggest challenge for the machine learning team is working with complex data and a shifting data distribution. This often requires an investigative mindset and cooperation with threat analysts and reverse engineers to create finely tuned and reliable solutions.

We welcome candidates from various backgrounds who are willing to specialize in machine learning, particularly in ML Ops: data professionals of all sorts, Python developers, or DevOps engineers with strong programming skills. As you will be an integral part of our machine learning team, knowing the basic concepts of machine learning and data science beforehand will be beneficial. Knowledge exchange will be mutual, so there will be plenty of opportunities to fill any knowledge gaps. Your fellow team members will teach you a great deal about machine learning and data science, but you should be able to coach them on coding standards and propose improvements to their (mostly) Python code.

Responsibilities

  • Develop new and maintain existing data workflows
  • Maintain the machine learning team’s GitHub repositories (mostly written in Python)
  • Provide expertise and guidance in setting standards, choosing tools, libraries, etc.
  • Perform code reviews
  • Identify bottlenecks and bugs, as well as devise solutions to these problems
  • Automate and maintain procedures for dataset updates
  • Automate and maintain procedures for retraining machine learning models
  • Develop various other automation scripts in Python and/or Bash

Requirements

  • Experience working with Python
  • Proficiency in NoSQL or relational databases
  • Experience designing data pipelines (familiarity with Airflow is a big plus!)
  • Experience with automated software deployment tools such as Docker
  • Experience in working with a Linux-based OS and Bash
  • Ability to identify bottlenecks and bugs and to devise solutions to them
  • Interest in machine learning and data science concepts
  • Enthusiasm for teamwork, constant learning, and adapting to new circumstances and cutting-edge technologies
  • Excellent knowledge of the English language


Desirable skills

  • Knowledge of ML Ops principles and best practices
  • Experience with CI/CD pipelines
  • Experience with queueing systems, e.g. RabbitMQ and Kafka
  • Experience in working with data visualization tools such as Grafana or Kibana
  • Experience with SQL databases like PostgreSQL and column-oriented analytical databases like ClickHouse
  • Basic knowledge of machine learning and data science concepts
  • Experience with cloud solutions, e.g. AWS

Benefits

  • Hybrid work options (paid accommodation & transportation to Zagreb during onboarding for remote employees)
  • Flexible working hours
  • Generous compensation and a bonus system based on annual performance
  • Hefty personal education budget and the opportunity to attend leading conferences and seminars in the field
  • Company library and the option to order books of your choice via Amazon
  • Permanent contract in a fast-growing global company with Fortune 500 companies & government agencies as clients
  • Challenging projects in a dynamic, collaborative team
  • Opportunity to work on innovative solutions in malware analysis & software assurance, crafted in our very own Croatian R&D center
  • Great career advancement opportunities - clear goals & internal promotions
  • Employee referral bonus program: EUR 1,060 net for junior positions, EUR 2,123 net for mid to senior positions, and EUR 2,654 net for principal/managerial positions
  • Multisport card, annual health checkup, newborn child allowance, rent-cost, and 3rd pillar pension benefits
  • Wellness Weekends: a quarterly, company-wide three-day weekend, starting with a company-paid Friday off for all employees
  • Fully covered car garage in Radnička for all employees


