Senior Software Engineer (Big data/AI) - Remote
Posted 8 months ago
Founded in 2010, Scrapinghub is a fast-growing and diverse technology business turning web content into useful data with a cloud-based web crawling platform, off-the-shelf datasets, and turnkey web scraping services.
We’re a globally distributed team of over 180 Shubbers working from over 30 countries who are passionate about scraping, web crawling, and data science.
About the Job:
Scrapinghub is looking for a Senior Backend Engineer to develop and grow a new web crawling and extraction SaaS.
The new SaaS will include our recently released AutoExtract, which provides an API for automated e-commerce and article extraction from web pages using Machine Learning. AutoExtract is a distributed application written in Java, Scala and Python; its components communicate via Apache Kafka and HTTP and are orchestrated using Kubernetes.
You will be designing and implementing distributed systems: building a large-scale web crawling platform, integrating Deep Learning based web data extraction components, working on queue algorithms and large datasets, creating a development platform for other company departments, and more. This is going to be a challenging journey for any backend engineer!
As a Senior Backend Engineer, you will have a large impact on the system we’re building, since the new SaaS is still in the early stages of development.
Job Responsibilities:
- Work on the core platform: develop and troubleshoot a Kafka-based distributed application, and write and change components implemented in Java, Scala and Python.
- Work on new features, including design and implementation. You should be able to own and be responsible for the complete lifecycle of your features and code.
- Solve distributed systems problems, such as scalability, transparency, failure handling, security, multi-tenancy.
Requirements:
- 3+ years of experience building large scale data processing systems or high load services.
- Strong background in algorithms and data structures.
- Strong track record in at least two of these technologies: Java, Scala, Python, with 3+ years of experience in at least one of them.
- Experience working with Linux and Docker.
- Good communication skills in English.
- Computer Science or other engineering degree.
Bonus points for:
- Kubernetes experience.
- Apache Kafka experience.
- Experience building event-driven architectures.
- Understanding of web browser internals.
- Good knowledge of at least one RDBMS.
- Knowledge of today’s cloud provider offerings: GCP, Amazon AWS, etc.
- Web data extraction experience: web crawling, web scraping.
- Experience with web data processing tasks: finding similar items, mining data streams, link analysis, etc.
- History of open source contributions.
As a new Shubber, you will:
- Become part of a self-motivated, progressive, multi-cultural team.
- Have the freedom and flexibility to work from where you do your best work.
- Attend conferences and meet with team members from across the globe.
- Work with cutting-edge open source technologies and tools.
- Receive paid time off.
- Enrol in Scrapinghub's Share Option Programme.