Data Engineer
Lahore, Punjab, Pakistan - Remote
Newrich Network
The NewRich Network is looking for a part-time Data Engineer (with the option to go full-time in 6 months) with experience in building data integrations using the AWS and Google Cloud technology stacks as part of the team's product portfolio.
We are a company that is 100% focused on delivering digital information and solutions to our customers all around the world. Our dev projects are at the core of our organization – and so is our dev team. We aren’t code monkeys here. We’re creators.
Our dev team is also 100% remote. So if you’re looking for a work-from-home opportunity, and you’re passionate about creating platforms for the new world - we want to hear from you!
Requirements
- Design and build applications that perform data analysis, transformations, aggregations, and other augmentations on large sets of data in a Spark-based AWS environment (EMR, S3, Glue, Redshift, Athena)
- Evaluate various pipeline models, tools, and environments and implement these to push data from our sources through your transformations and finally to our customers
- Work with product management and data research teams to prototype and test new ideas, then take them to production
- Work in a fast-paced, innovate-and-test environment.
What You'll Do:
- Collaborate with data architects, enterprise architects, solution consultants, and product engineering teams to gather customer data integration requirements, conceptualize solutions, and build the required technology stack
- Collaborate with enterprise customers' engineering teams to identify data sources, profile and quantify the quality of those sources, develop tools to prepare data, and build data pipelines that integrate customer and third-party data sources
- Develop new features and improve existing data integrations with the customer data ecosystem
- Encourage the team to think outside the box and overcome engineering obstacles while incorporating innovative design principles
- Collaborate with a Project Manager to bill and forecast time for product owner solutions
- Build data pipelines
- Reconcile missed data
- Acquire datasets that align with business needs
- Develop algorithms to transform data into useful, actionable information
- Build, test, and maintain database pipeline architectures
- Collaborate with management to understand company objectives
- Create new data validation methods and data analysis protocols
- Ensure compliance with data governance and security policies
What You Need To Succeed:
- 4+ years of experience in Data Engineering
- Excellent communication and interpersonal skills are a MUST.
- Bachelor’s degree in Computer Science, Engineering or a related discipline - Preferred but not mandatory
- 3+ years of experience working on Apache Spark applications using Python (PySpark) or Scala
- Experience creating Spark jobs that work on at least 1 billion records
- Strong knowledge of ETL architecture and standards
- Software development experience working with Apache Airflow, Spark, MongoDB, MySQL
- Strong SQL knowledge
- Strong command of Python
- Experience creating data pipelines in a production system
- Proven experience in building, operating, and maintaining fault-tolerant and scalable data processing integrations using AWS
- Experience using Docker or Kubernetes is a plus
- Ability to identify and resolve problems associated with production-grade, large-scale data processing workflows
- Experience with crafting and maintaining unit tests and continuous integration.
- Passion for crafting intelligent data pipelines that teams love to use
- Strong capacity to handle numerous projects is a must
Annual Salary
Ranges from 18,000 USD for part-time
* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰
Tags: Airflow Architecture Athena AWS Computer Science Data analysis Data governance Data pipelines Docker Engineering ETL GCP Google Cloud Kubernetes MongoDB MySQL Pipelines PySpark Python Redshift Research Scala Security Spark SQL
Perks/benefits: Flex vacation