Data Engineer (Web Scraping), Digital Services
Hyderabad, India
Ninja Van
Ninja Van is Southeast Asia’s leading logistics provider, with the highest service coverage across six countries in the region. Experience the joy of hassle-free deliveries by shipping with Ninja Van today.
Ninja Van is a tech-enabled logistics company on a mission to provide hassle-free delivery services for businesses of all sizes across Southeast Asia. We launched in Singapore in 2014 and have since become the region's largest and fastest-growing last-mile logistics company, partnering with over 35,000 merchants and delivering more than 1,000 parcels every minute across six countries.
At our core, we are a technology company that is disrupting a massive industry with cutting-edge software and operational concepts. Powered by algorithm-based optimisation, dynamic routing, end-to-end tracking and a data-driven approach, we provide best-in-class delivery services that delight both shippers and end customers. But we are just getting started! We have much room for improvement and many ideas that will further shape the industry.
As a Data Engineer focused on web scraping, you will be responsible for extracting and ingesting data from websites using web crawling tools. In this role you will own the creation of the tools, services, and workflows that improve crawl/scrape analysis, reporting, and data management. We will rely on you to test the scraped data to ensure its accuracy and quality. You will also own the process of identifying and fixing scraping failures, and of scaling the scrapes as needed.
Requirements
- Experience running large-scale web scrapes
- Solid Python knowledge
- Familiarity with Linux/UNIX, HTTP, HTML, JavaScript and networking
- Familiarity with techniques and tools for crawling, extracting and processing data (e.g. Scrapy, Pandas, MapReduce, SQL, BeautifulSoup, Selenium)
- Experience with system monitoring/administration tools
- Experience with version control, open source practices, and code review
- Experience with applications designed to display archived web content
- Great communication skills (written and spoken English)
- Bachelor's degree in Computer Science or a related field, or equivalent demonstrated experience