Senior Data Engineer

Warsaw, Masovian Voivodeship, Poland

Applications have closed

Data Engineer for the Computer Vision Team

Sportradar, a global leader in understanding and leveraging the power of sports data for hundreds of business customers around the world, seeks a talented Data Engineer to join our diverse, highly skilled, and super enthusiastic Computer Vision Team. You will support the team in building the extensive datasets needed for training deep learning and computer vision models.

About the role

The data engineering team plays a critical role in Computer Vision (CV) development at Sportradar. The team provides data that power the Computer Vision and Deep Learning models that are key to our strategic direction. It partners closely with product and technical leads from CV and AI and provides the data “fuel” necessary to develop some of the leading-edge technology at Sportradar. As a Data Engineer on the Data Engineering Team, you will build vast datasets and make them more useful and accessible for CV Engineers. You will also support and coordinate data labeling within our huge team of data annotators for one or more teams within our Computer Vision Tribe. Are you ready for this new challenge?

Some of the ongoing projects you will be involved in are:

  • Sourcing, transforming, and analyzing data from various systems and making it more useful and accessible for Computer Vision and Deep Learning training.
  • Participating in the design and development of the tools, systems and infrastructure for labelling solutions and data management that enable us to create vast datasets in the field of sports.
  • Coordinating with the data annotators who work on data labelling campaigns based on requirements and specifications set by Computer Vision and Deep Learning Engineers, and providing the oversight needed to ensure the highest quality of the datasets needed for model development.

Who are we looking for?

You are a talented data engineer who enjoys working with data in every conceivable way. Data is your passion! You also like challenges, strive for continuous improvement, and are eager to learn a new thing every day. On top of being great with data, you are also great at talking with stakeholders, collecting requirements, and coordinating small projects that involve internal and external stakeholders. You enjoy collaborating with a diverse group of people and are passionate about what you do. Besides all of that, you are not afraid to go out of your comfort zone.

THE CHALLENGE:

  • Analyze raw data, develop and maintain datasets, and improve data quality and efficiency. In doing so, you will support our common mission to develop best-in-class Computer Vision and Deep Learning models.
  • Identify opportunities for data acquisition, combine raw information from different sources and continuously explore ways to enhance data quality and reliability.
  • Play a vital role in coordinating and scoping out projects in a highly cross-functional environment with internal stakeholders (e.g. ML/CV Engineers & POs) and external stakeholders (Data Labeling partner companies & individual annotators).
  • Work closely with the Annotations Squad and the other Squad(s) you support, and proactively propose strategies for robust data collection at scale, contributing to continuous improvement of the data labeling and annotation efforts.
  • Maintain in-depth understanding of relevant data engineering best practices.
  • Prepare datasets and their systems for machine learning, analytics, and reporting use cases.
  • Perform ad-hoc data analyses and support data scientists and analysts with their data requirements.



Requirements

PROFESSIONAL & PERSONAL REQUIREMENTS:

  • University degree in Computer Science, Data Science or related field.
  • Strong analytical and problem-solving skills and the ability to combine data from different sources.
  • At least 2 years of related professional experience.
  • Experience in conducting statistical analysis.
  • Advanced knowledge of SQL.
  • Proficiency with Python as well as knowledge of other programming languages; Scala and Go will be considered a plus.
  • Experience with data warehousing, big data query engines, and technologies such as Spark, Flask, Hadoop, Kafka, and MongoDB; DevOps: Linux, Bash, Git, Docker.
  • Some experience with AWS Cloud: S3, EMR, Athena, Glue, EC2, Batch; other cloud-based technologies are a plus.
  • Good understanding of data structures.
  • Some project management skills (able to interface with stakeholders, maintain schedule, manage a workflow ...).
  • Basic understanding of machine learning methods.
  • Detail-oriented, with a love of working with big datasets.
  • Autonomous, creative, and a team player.
  • Fluent in English (written and spoken).

BONUS SUPERPOWERS:

  • PhD or master’s degree in Computer Science, Data Science or related field.
  • Hands-on experience in creating algorithms for data analysis.
  • Experience in successfully managing data labelling campaigns in the context of machine learning.

Benefits

OUR OFFER:

  • Competitive salary and benefits.
  • Opportunity to work and grow in a dynamic tech environment within an inspiring and fast-growing company.
  • A collaborative environment of talented and passionate colleagues from all over the world (Engineering offices in Europe, Asia and the US) delivering innovative products.
  • A hybrid working model.


