Data Engineer - Machine Learning Product Catalogue

Poznań, Warsaw, Poland

Allegro

Allegro - The best prices and a guarantee of safe shopping!

Job Description

The salary range for this position (contract of employment) is:

mid: 12 300 - 17 600 PLN in gross terms

senior: 16 100 - 23 200 PLN in gross terms

A hybrid work model that incorporates solutions developed by the leader and the team

We are looking for a Data Engineer with a focus on data processing and preparation, and on the deployment and maintenance of our ML/data projects. Join our team to enhance your skills in deploying data-based processes and MLOps approaches, and to share those skills within the team.

We are looking for people who have:

  • 2+ years of hands-on experience in Python and its data processing toolset (pandas, NumPy)
  • Experience in process/solution monitoring
  • Knowledge and experience in processing large datasets with Big Data tools, especially Spark (PySpark)
  • Proficiency in using development tools (git, issue tracking, pull requests, code reviews etc.), familiarity with software engineering best practices (PEP8, code review, documentation, CI/CD, testing, automation etc.)
  • DevOps experience
  • Experience in writing advanced and efficient SQL queries (especially in GCP/BigQuery environment)
  • Experience in working on cloud solutions and architecture (GCP, AWS, Azure)
  • Understanding of AI-related concepts (classification vs clustering, modeling, precision/recall metrics, model evaluation etc.) and a demonstrated ability to use those metrics to back up assumptions and evaluate outcomes
  • Positive attitude and ability to work in a team
  • Good communication skills and proactivity in seeking, clarifying and understanding information from end users and stakeholders

An additional advantage would be:

  • Previous experience in building, evaluating or deploying ML/AI-based solutions
  • Knowledge of ML libraries (sklearn, xgboost, lgbm)
  • MLOps practical experience
  • Previous experience with GCP tools for data processing (e.g. BigQuery, Dataproc) and with workflow automation solutions (e.g. Airflow)
  • GCP certifications and/or hands-on experience in GCP, including ML/AI tools (Vertex AI)

Our tech stack:

  • Python, BigQuery SQL, Spark
  • Google Cloud Platform (Airflow, BigQuery, Composer)
  • GitHub (code storage, CI/CD, hosting our own Data Science Python library)

What we offer:

  • A hybrid work model that you will agree on with your leader and the team. We have well-located offices (with fully equipped kitchens and bicycle parking facilities) and excellent working tools (height-adjustable desks, interactive conference rooms)
  • Annual bonus of up to 10% of your gross annual salary (depending on your annual assessment and the company’s results)
  • A wide selection of fringe benefits in a cafeteria plan – you choose what you like (e.g. medical, sports or lunch packages, insurance, purchase vouchers)
  • English classes, paid for by us, tailored to the specific nature of your job
  • Working in a team you can always count on: we have top-class specialists and experts in their fields on board
  • A high degree of autonomy in terms of organizing your team’s work; we encourage you to develop continuously and try out new things
  • Hackathons, team tourism, a training budget and an internal educational platform, MindUp (including training courses on work organization, means of communication, motivation to work and various technologies and subject-matter issues)
  • A 16" or 14" MacBook Pro with an M1 processor and 32 GB RAM, or a corresponding Dell with Windows (if you don’t like Macs), and other gadgets that you may need

What will your responsibilities be?

  • You will be actively responsible for building data processing tools for modeling, analysis and ML, in close cooperation with the Data Science team
  • You will support the Data Science team in the development of data sources for ad-hoc analyses and Machine Learning projects
  • You will process terabytes of data using Google Cloud Platform BigQuery, Composer, Dataflow and PySpark, and you will optimize processes in terms of their performance and GCP cloud processing costs
  • You will collect process requirements from project groups and automate tasks related to preprocessing and data quality monitoring, prediction serving, as well as Machine Learning model monitoring, alerting and retraining
  • You will be responsible for the engineering quality of each project and you will cooperate with your colleagues on engineering excellence

Why is it worth working with us?

  • Through the supplied data and processes, you will have a meaningful impact on the operation of one of the largest e-commerce platforms in the world
  • Thanks to the wide range of projects we are involved in, you will never be without an interesting challenge to take on
  • You will have access to vast datasets (measured in petabytes)
  • You will get a chance to work in a team of experienced engineers and Big Data specialists who are willing to share their knowledge (including with the general public, as part of allegro.tech)
  • Your professional growth will follow the most recent open-source technological trends
  • You will have an actual impact on the directions of product development and on the selection of particular technologies – we use the most recent and best technological solutions available, because we align them closely with our needs
  • We are a full-stack provider – we design, code, test, deploy and maintain our solutions

Send us your CV and learn why it’s #goodtobehere

Tags: Airflow Architecture AWS Azure Big Data BigQuery CI/CD Classification Clustering Dataflow Dataproc Data quality DevOps E-commerce Engineering GCP Git GitHub Google Cloud Machine Learning MLOps NumPy Open Source Pandas PySpark Python Scikit-learn Spark SQL Testing Vertex AI XGBoost

Perks/benefits: Career development Gear Lunch / meals Salary bonus Startup environment

Region: Europe
Country: Poland