Senior Python Data Engineer
Lisboa, Portugal
Daltix
Daltix provides quality retail FMCG data. Rely on our price, promotion and product data to save resources, monitor competition and easily analyse insights. Daltix enables retailers & suppliers to make decisions based on data rather than gut feeling, and to that end it has built up significant experience in how to collect data and how to transform it to support decision making.
We scrape around 3TB of compressed data per month (20TB uncompressed). If you'd like to learn how this is done and the challenges that come with it, here's your chance! To this end, we’re looking for a Senior Python Data Engineer who’ll help some of the biggest names in the industry become truly data-driven (don’t take our word for it, check our website).
You will join our data teams who are in charge of standardizing and extracting information from the data we collect, as well as making it accessible for analytics & reporting. Your responsibilities will involve both Data Engineering and Data Analysis skills.
You will aid us with:
Adding new data processing modules to our pipeline so we can standardise data collected from the web.
Managing the infrastructure (schedulers, computing frameworks) used for our large-scale data processing & reporting.
Quality Assurance of our data. We have some tooling in place already, but some more might be necessary. You will likely want to automate some of these checks.
Assisting with existing ETL pipelines that make our data ready for our customers to use.
Enabling our professional services team by providing Python-based toolkits to make their jobs easier.
What Daltix offers:
Private health insurance, a solid laptop (MacBook, Linux-friendly or Windows - it's up to you) and a lot of flexibility!
The opportunity for you to work only 4 days a week, if that's what you prefer!
Centrally located office near São Sebastião Metro Station. We are a remote-friendly company. At the moment, we are working 100% remotely until the end of 2021 due to the pandemic. Afterwards, we will continue with a hybrid model.
Your future colleagues will be Nelson Torres (Data Engineer), and Miguel Almeida (Data Scientist). Your team leader will be Manuel Garrido (Data Architect).
Work with a modern tech stack including: Python, Docker, Terraform, AWS (S3, Batch), Grafana, Airflow, Snowflake & Looker.
Best practices for software engineering including mandatory code reviews, unit tests and benchmarks running on every commit, infrastructure-as-code, among others. We're not where we want to be yet, so there's room to add your touch here.
Squad rotations, allowing you to spend some time per week doing work with another team and learn more about the challenges other colleagues are facing.
Requirements
At least 5 years of relevant working experience in Data Engineering. We also value knowledge of Data Analysis; however, most of the tasks at first will be Data Engineering.
- Must-have tech experience (we use these every day):
Python, as 99% of our stack is in Python
SQL
Git
Bash
VIM (nah just kidding, it's the best one though)
- Nice-to-have tech experience (we use these every day, so it's helpful if you know them too):
Pandas + Jupyter notebook
Regex
CI / CD
Docker
Cloud experience (AWS preferred)
The application process involves a technical challenge. Interviews and the challenge will be conducted remotely.
We communicate exclusively in English, so fluent technical English is mandatory. Portuguese is not required. Most of our data is in Dutch, so knowledge of Dutch is a plus.