If you are applying for this role, you are someone who likes to streamline data processes, analyze data, and uncover insights. A Junior Data Scientist takes an independent, hands-on approach to building Vettons’s big-data stack, from data capture and wrangling to modelling and visualization. You will build machine learning models to make predictions, answer key business questions, and solve complex business problems. A Junior Data Scientist is a collaborative and agile team member who helps explore new frontiers by pushing the boundaries of technology together with the team.

You can expect to be given ambitious but achievable goals, and to show quick results by building in short iterations with lots of experimentation.


  • Identify valuable data sources and automate data pipelining processes
  • Undertake preprocessing, cleansing and streamlining of structured and unstructured data
  • Analyze large amounts of information to discover trends and patterns
  • Cluster large amounts of user-generated content and process data in large-scale environments such as Google Cloud Platform (GCP), Amazon Web Services (AWS), and Cloud Native Hadoop
  • Advocate for integrating diverse data sources into the data lake, and manage and provide recommendations for the ETL process
  • Build predictive models and machine-learning algorithms
  • Use predictive modeling to increase and optimize customer experiences, revenue generation, and other business outcomes
  • Present information using data visualization techniques
  • Propose solutions and strategies to business challenges
  • Collaborate with engineering and product development teams
  • Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions, and build the roadmap and implement the analytics program to meet internal product performance analytics requirements

  • 2+ years of experience as a Data Analyst or in a related role
  • Experience in data mining
  • Hands-on experience with Big Data and Machine Learning tools and big data management frameworks like BigQuery, Dataproc, Amazon EMR, Elasticsearch, etc., on cloud services such as Google Cloud Platform, Amazon Web Services, Microsoft Azure
  • Understanding of machine learning and operations research
  • Hands-on experience in analytics modeling using programming languages like Python, R and SQL; familiarity with PySpark, Scala, Java or C++ is an asset
  • Experience in using Jupyter and RStudio notebooks
  • Experience building models using supervised and unsupervised machine learning, Natural Language Processing (NLP), recommendation systems, probabilistic and statistical inference approaches to advance analysis models, time-series analysis, numerical optimization, and hypothesis testing; hands-on experience with k-NN, Naive Bayes, SVM, Decision Trees, Random Forest, Logistic Regression, Neural Networks and Deep Learning would be an added advantage
  • Experience using business intelligence tools (e.g. Tableau) and data frameworks (e.g. Hadoop)
  • Experience with NoSQL databases like DynamoDB, MongoDB
  • Experience with tools like GitHub, GitLab, Confluence, and Jira would be an added advantage
  • Analytical mind and business acumen
  • Solid math skills (e.g. statistics, algebra)
  • BSc/BA in Computer Science, Engineering or relevant field; graduate degree in Data Science or other quantitative field is preferred
