Senior Data Scientist - Data Intelligence Team

Berlin, Berlin, Germany

PriceHubble

Leading the development of Data & explainable AI-driven real estate valuations and insights globally.


PriceHubble is a PropTech company set to radically improve the understanding and transparency of real estate markets based on data-driven insights. We aggregate and analyse a wide variety of data, run big data analytics, and use state-of-the-art machine learning to generate stable and reliable valuations and predictive analytics for the real estate market. We are headquartered in Zürich, with offices in Paris, Hamburg, Berlin, Amsterdam, Vienna, Prague, and Tokyo. We work in international markets and are backed by world-class investors. We offer a startup environment, low bureaucracy, and an international team and business.

Your role

Data is at the core of PriceHubble. We process a wide variety of data from multiple sources. As a data scientist in the data-intelligence team, you will have three main missions:

  • First, to augment the data we have via machine learning prediction.
  • Second, to develop techniques to measure, assess, and improve the quality of the data we have.
  • Third, to develop matching algorithms for linking data from heterogeneous sources.


As a senior data scientist, you are highly motivated by the following questions:

  • Before doing standard machine learning, how do I build a strong labeled data set from scratch?
  • Garbage in, garbage out: how do I measure the quality of labels in a data set, and how do I improve it when I have very few labels to start with?
  • How can I go from no labels to the point where state-of-the-art machine learning can finally be leveraged?
  • How do I plan research projects spanning 3 months to 1 year in a way that structurally mitigates risk?
  • What should be the next steps in a research project? Where should we focus research efforts: the models, the labels/training data, feature engineering, post-processing, or elsewhere?

These questions are, in our opinion, the new frontier in data science. You will be joining a team that specializes in this topic, with, among other things, advanced experience in crowd-sourcing, matching problems, ensemble modeling, and statistical estimation. Our technologies and tools are just getting started. Feeling excited about it? Want to be part of the adventure? Hop in!

Responsibilities

  • Mentor more junior team members
  • Define roadmap & approaches for research projects
  • Actively mitigate risks in machine learning projects by attacking high-risk items first and making sure projects fail fast if they are likely to hit a structural blocker
  • Apply machine learning methods to augment data sets
  • Develop and improve models for cross-linking heterogeneous data sources
  • Analyse and detect problems in our estimators
  • Correct blind spots in our data labelling
  • Deploy, validate, and fine-tune crowd-sourcing jobs for acquiring labels

Requirements

  • MSc or PhD in Computer Science, Applied Mathematics, or a related field, with strong experience in machine learning and/or data science.
  • 3-5 years of experience in a data science, research (incl. PhD), or quantitative role.
  • In-depth understanding of basic data structures and algorithms.
  • Strong analytical skills with the ability to collect, organise, and analyse significant amounts of data with attention to detail and accuracy.
  • Strong programming experience with Python, and ability to write quality production code.
  • Experience with crowd-sourcing, active learning, semi-supervised learning, ensemble modeling, or matching problems is a plus.
  • Experience with the ETL and data processing tools we use (pandas, Airflow, PySpark) is an advantage.
  • Experience with standard ML frameworks (scikit-learn, TensorFlow, PyTorch, ...) is also a plus.
  • Comfortable working in English; you have a great reading command and a good spoken command of it.

Benefits

On top of joining a team of ambitious, qualified people, you may also enjoy our benefits:

🕓 Flexible work hours

💰 Competitive salary

👖 Casual dress code

📘 L&D program

🏢 Well-located offices

🍏 Free snacks, fruits, coffee, beers, sodas



