Data Engineer (Data Science Hub)
Warszawa, Kraków, Poznań, Toruń, Wrocław, Katowice, Lublin, Gdańsk, Łódź, Poland
Job description
The salary range for this position (mid level) is 12,300 - 17,600 PLN gross (contract of employment)
The Data Science Hub (DSH) is where we solve various business problems using analytical techniques and machine learning. We deliver insights and make decisions based on terabytes of data processed on a daily basis. Our team is a great place for people who seek continuous development opportunities and a unique chance to acquire interdisciplinary knowledge about how e-commerce platforms work. The variety of impacted business domains is best described by a diverse portfolio of projects, including:
- buyer and merchant - churn prediction
- logistics - delivery time prediction, logistic network optimization
- marketing - category recommendation, next purchase prediction
- pricing - price optimization
- finance - sales forecasting
- and many more…
The Data Science Hub consists of 4 teams:
- two Data Science teams
- an Analytics team
- a Data Engineering team
We are looking for new members for the Data Engineering team, where we focus on data processing and preparation, the deployment and maintenance of our projects, and sharing our skills with the rest of the Data Science Hub to ensure engineering excellence across the Hub.
Join our team to enhance your skills in state-of-the-art data processing techniques and MLOps approaches.
We are looking for people who:
- Can work fluently with SQL in traditional engines (e.g. MySQL, PostgreSQL) or cloud engines (e.g. BigQuery, Snowflake); you will be working with SQL on a daily basis
- Have experience in Python programming and are familiar with software engineering best practices (PEP8, clean architecture, code review, CI/CD, etc.)
- Have a positive attitude and the ability to work in a team
- Are eager to constantly develop and broaden their knowledge
An additional advantage would be:
- Experience with Big Data ecosystem (Spark, Airflow)
- Knowledge of Big Data tools on Google Cloud Platform or another public cloud (e.g. AWS, Azure)
- Commercial experience with DevOps and CI/CD practices (e.g. GitHub Actions) in the area of ML/AI
- Experience with cloud applications architecture
Our tech stack:
- Python (currently 3.7, migrating to a newer version)
- Google Cloud Platform (Airflow, BigQuery, Composer)
- GitHub (code storage, CI/CD, hosting our own Data Science Python library)
What we offer:
- A hybrid work model that you will agree on with your leader and the team. We have well-located offices (with fully equipped kitchens and bicycle parking facilities) and excellent working tools (height-adjustable desks, interactive conference rooms)
- Annual bonus up to 10% of the annual salary gross (depending on your annual assessment and the company's results)
- A wide selection of fringe benefits in a cafeteria plan – you choose what you like (e.g. medical, sports or lunch packages, insurance, purchase vouchers)
- English classes, paid for by us and tailored to the specific nature of your job
- Working in a team you can always count on — we have on board top-class specialists and experts in their areas of expertise
- A high degree of autonomy in terms of organizing your team’s work; we encourage you to develop continuously and try out new things
- Hackathons, team tourism, a training budget and an internal educational platform, MindUp (including training courses on work organization, means of communication, motivation to work, and various technologies and subject-matter issues)
What will your responsibilities be?
- You will be actively responsible for building data processing tools for modeling and analysis – in close cooperation with both Data Science teams
- You will be supporting both Data Science teams in the development of data sources for ad-hoc analyses and Machine Learning projects
- You will process terabytes of data using Google Cloud Platform (BigQuery, Composer, Dataflow) and PySpark, and optimize processes for performance and GCP processing costs
- You will collect process requirements from project groups and automate tasks related to preprocessing and data quality monitoring, prediction serving, as well as Machine Learning model monitoring and retraining
- You will be responsible for the engineering quality of each project and will cooperate with your colleagues on engineering excellence
Why is it worth working with us?
- Through the supplied data and processes, you will have a meaningful impact on the operation of one of the largest e-commerce platforms in the world
- Thanks to the wide range of projects we are involved in, you will never be without an interesting challenge to take on
- You will have access to vast datasets (measured in petabytes)
- You will get a chance to work in a team of experienced engineers and BigData specialists who are willing to share their knowledge (incl. with the general public, as part of allegro.tech)
- Your professional growth will follow the most recent open-source technological trends
- You will have an actual impact on the directions of product development and on the selection of particular technologies – we use the most recent and best technological solutions available, because we align them closely with our needs
- We are a full-stack provider – we design, code, test, deploy and maintain our solutions
Apply to Allegro and see why it is #dobrzetubyć (#goodtobehere)