Technical Lead (Data Engineering) | Remote-Friendly
Pune, Maharashtra, India - Remote
Velotio
Velotio Technologies is a leading product engineering & digital solutions company for innovative startups and enterprises. Velotio has worked with over 90 global customers, including NASDAQ-listed enterprises and unicorn startups. We specialize...
At Velotio, we are embracing a remote-friendly work culture where everyone has the flexibility to work either remotely or from our office in Pune.
Join us and work from wherever you feel most productive!
About Velotio:
Velotio Technologies is a product engineering company working with innovative startups and enterprises. We are a certified Great Place to Work® and recognized as one of the best companies to work for in India. We have provided full-stack product development for 110+ startups across the globe building products in the cloud-native, data engineering, B2B SaaS, IoT & Machine Learning space. Our team of 220+ elite software engineers solves hard technical problems while transforming customer ideas into successful products.
As a Technical Lead - Data Engineer, you'll contribute to the design and development of a Data Analytics platform using the latest tools and cloud technologies for a variety of workloads, including real-time analytics and batch data. You will also lead a team of 2-10 engineers.
Requirements
- Design and build scalable data infrastructure with efficiency, reliability, and consistency to meet rapidly growing data needs
- Build the applications required for optimal extraction, cleaning, transformation, and loading of data from disparate data sources and formats using the latest big data technologies
- Build ETL/ELT pipelines and work with other data infrastructure components, like Data Lakes, Data Warehouses, and BI/reporting/analytics tools
- Work with various cloud services like AWS, GCP, and Azure to implement highly available, horizontally scalable data processing and storage systems, and automate manual processes and workflows
- Implement processes and systems to monitor data quality, to ensure data is always accurate, reliable, and available for the stakeholders and other business processes that depend on it
- Work closely with different business units and engineering teams to develop a long-term data platform architecture strategy and thus foster data-driven decision-making practices across the organisation
- Help establish and maintain a high level of operational excellence in data engineering
- Evaluate, integrate, and build tools to accelerate Data Engineering, Data Science, Business Intelligence, Reporting, and Analytics as needed
- Practice test-driven development by writing unit and integration tests
- Contribute to design documents and engineering wiki
You will enjoy this role if you...
- Like building elegant, well-architected software products with enterprise customers
- Want to learn to leverage public cloud services & cutting-edge big data technologies, like Spark, Airflow, Hadoop, Snowflake, and Redshift
- Work collaboratively as part of a close-knit team of geeks, architects, and leads
Desired Skills & Experience:
- 4+ years of data engineering or equivalent knowledge and ability
- 4+ years of software engineering or equivalent knowledge and ability
- Strong proficiency in at least one of the following programming languages: Python, Scala, or Java
- Experience designing and maintaining at least one type of database (Object Store, Columnar, In-memory, Relational, Tabular, Key-Value Store, Triple-store, Tuple-store, Graph, and other related database types)
- Good understanding of star/snowflake schema designs
- Extensive experience working with big data technologies like Spark, Hadoop, Hive
- Experience building ETL/ELT pipelines and working on other data infrastructure components like BI/reporting/analytics tools
- Experience working with workflow orchestration tools like Apache Airflow, Oozie, Azkaban, NiFi, Airbyte, etc.
- Experience building production-grade data backup/restore strategies and disaster recovery solutions
- Hands-on experience with implementing batch and stream data processing applications using technologies like AWS DMS, Apache Flink, Apache Spark, AWS Kinesis, Kafka, etc.
- Knowledge of best practices in developing and deploying applications that are highly available and scalable
- Experience with or knowledge of Agile Software Development methodologies
- Excellent problem-solving and troubleshooting skills
- Process-oriented with excellent documentation skills
Bonus points if you:
- Have hands-on experience using one or more cloud service providers like AWS, GCP, or Azure and have worked with specific products like EMR, Glue, Dataproc, Databricks, Data Studio, etc.
- Have hands-on experience working with either Redshift, Snowflake, BigQuery, Azure Synapse, or Athena and understand the inner workings of these cloud storage systems
- Have experience building data lakes, scalable data warehouses, and data marts
- Have familiarity with tools like Jupyter Notebooks, Pandas, NumPy, SciPy, scikit-learn, Seaborn, SparkML, etc.
- Have experience building and deploying Machine Learning models to production at scale
- Possess excellent cross-functional collaboration and communication skills
Benefits
Our Culture:
- We have an autonomous and empowered work culture encouraging individuals to take ownership and grow quickly.
- Flat hierarchy with fast decision making and a startup-oriented “get things done” culture.
- A strong, fun & positive environment with regular celebrations of our success. We pride ourselves on creating an inclusive, diverse & authentic environment.
We want to hire smart, curious, and ambitious folks, so please reach out even if you do not have all of the requisite experience. We are looking for engineers with the potential to grow!
Note: Currently, all interviews and onboarding processes at Velotio are being carried out remotely through virtual meetings.
Tags: Agile Airflow Architecture Athena AWS Azkaban Azure Big Data BigQuery Business Intelligence Data Analytics Databricks Dataproc Data quality ELT Engineering ETL Flink GCP Hadoop Jupyter Kafka Kinesis Machine Learning ML models NumPy Oozie Pandas Pipelines Python Redshift Scala SciPy Seaborn Snowflake Spark SparkML TDD
Perks/benefits: Career development Flat hierarchy Startup environment