Senior Data Engineer - Web scraping
Vienna, Vienna, Austria
Applications have closed
PriceHubble
Leading the development of Data & explainable AI-driven real estate valuations and insights globally. PriceHubble is a PropTech company, set to radically improve the understanding and transparency of real estate markets based on data-driven insights. We aggregate and analyse a wide variety of data, run big data analytics and use state-of-the-art machine learning to generate stable and reliable valuations and predictive analytics for the real estate market. We are headquartered in Zürich, with offices in Paris, Berlin, Hamburg, Vienna, Amsterdam and Tokyo. We work in international markets, are backed by world-class investors and treasure a startup environment with low bureaucracy, high autonomy and a focus on getting things done.
Your team:
Data at PriceHubble is:
- Made of almost 35 engineers (25% of the company) and growing, focused on Data Engineering, Data Science and Data Acquisition.
- The backbone of every PriceHubble product, orchestrating dozens of daily and hourly data pipelines and scrapers with millions of geospatial data points.
- Working with state-of-the-art technologies such as Spark on k8s, TensorFlow, Kubernetes, Google Cloud Platform, Airflow, Docker, Scrapy.
- An international team with engineers working from Paris, Zurich, Berlin, Hamburg and remotely from the EU*
Your responsibilities:
- Develop scrapers
- Automate the deployment of scrapers on a cloud architecture
- Build tools lowering the cost of developing scrapers on a new website
- Define strategies for monitoring the validity of the extracted data
- Improve and widen the team’s arsenal for avoiding bans
- Craft and test strategies to optimize the calls made to a website
- Debug and maintain scrapers
Requirements
Background:
- BSc in Computer Science or equivalent
- At least 2 years of experience in the industry
- Experience in web crawling
- Proficiency in Python and Scrapy
Nice to haves:
- Puppeteer / Splash knowledge
- Familiarity with software engineering best practices (clean code, code review, test-driven development, ...) and version control systems
- Understanding of basic data structures and algorithms
- Familiarity with cloud computing technologies and our tech stack (GCP, Kubernetes, Docker and Airflow)
- JS knowledge
Soft skills:
- You want to work in a fast, high-growth startup environment
- You like beautiful software and not just software that solves a problem
- You like to learn from your colleagues and share your knowledge and experience
- You are comfortable working in English; you read it very well and have a good spoken command of it
* We are interested in every qualified candidate who is eligible to work in the European Union but we are not able to sponsor visas.
Benefits
🕓 Flexible work hours
👖 Casual dress code
🍏 Free snacks, fruits, coffee, beers, sodas
🍺 Thursday drinks
✈️ Relocation package
📘 L&D program
🏢 Well-located offices
💰 Competitive salary