Senior Data Engineer - Web scraping

Vienna, Vienna, Austria

PriceHubble

Leading the development of data-driven and explainable-AI-driven real estate valuations and insights globally.

PriceHubble is a PropTech company, set to radically improve the understanding and transparency of real estate markets based on data-driven insights. We aggregate and analyse a wide variety of data, run big data analytics and use state-of-the-art machine learning to generate stable and reliable valuations and predictive analytics for the real estate market. We are headquartered in Zürich, with offices in Paris, Berlin, Hamburg, Vienna, Amsterdam and Tokyo. We work in international markets, are backed by world-class investors and treasure a startup environment with low bureaucracy, high autonomy and a focus on getting things done.


Your team:

Data at PriceHubble is:

  • Made up of almost 35 engineers (25% of the company) and growing, focused on Data Engineering, Data Science and Data Acquisition.
  • The backbone of every PriceHubble product, orchestrating dozens of daily and hourly data pipelines and scrapers with millions of geospatial data points.
  • Working with state-of-the-art technologies such as Spark on k8s, TensorFlow, Kubernetes, Google Cloud Platform, Airflow, Docker, Scrapy.
  • An international team with engineers working from Paris, Zurich, Berlin, Hamburg and remotely from the EU*


Your responsibilities:

  • Develop scrapers
  • Automate the deployment of scrapers on a cloud architecture
  • Build tools that lower the cost of developing scrapers for a new website
  • Define strategies for monitoring the validity of the extracted data
  • Improve and widen the team’s arsenal for avoiding bans
  • Craft and test strategies to optimize the calls made to a website
  • Debug and maintain scrapers

Requirements

Background:

  • BSc in Computer Science or equivalent
  • At least 2 years of experience in the industry
  • Experience in web crawling
  • Proficiency in Python and Scrapy


Nice to haves:

  • Puppeteer / Splash knowledge
  • Familiarity with software engineering best practices (clean code, code review, test-driven development, ...) and version control systems
  • Understanding of basic data structures and algorithms
  • Familiarity with cloud computing technologies and our tech stack (GCP, Kubernetes, Docker and Airflow)
  • JS knowledge

Soft skills:

  • You want to work in a fast-paced, high-growth startup environment
  • You like beautiful software and not just software that solves a problem
  • You like to learn from your colleagues and share your knowledge and experience
  • You are comfortable working in English, with a great reading and good spoken command of it

* We are interested in every qualified candidate who is eligible to work in the European Union but we are not able to sponsor visas.

Benefits

🕓 Flexible work hours

👖 Casual dress code

🍏 Free snacks, fruits, coffee, beers, sodas

🍺 Thursday drinks

✈️ Relocation package

📘 L&D program

🏢 Well-located offices

💰 Competitive salary

Tags: Airflow Big Data Computer Science Data Analytics Data pipelines Docker Engineering GCP Google Cloud Kubernetes Machine Learning Pipelines Python Spark TDD TensorFlow

Perks/benefits: Career development Competitive pay Flex hours Relocation support Snacks / Drinks Startup environment

Region: Europe
Country: Austria
Category: Engineering Jobs
