Data Engineer
India Remote
H1
H1 is the convening force for global HCP, clinical, science, and research insights that inform a healthier future. Join the journey.
Data Engineering has teams that are responsible for collecting, curating, normalizing and matching data from hundreds of disparate sources from around the globe. Data sources include scientific publications, clinical trials, conference presentations and claims among others. In addition to developing the necessary data pipelines to keep every piece of information updated in real-time and provide the users with relevant insights, the teams are also building automated, scalable and low-latency systems for the recognition and linking of various types of entities, such as linking researchers and physicians to their scholarly research and clinical trials. As we rapidly expand the markets we serve and the breadth and depth of data we want to collect for our customers, the team must grow and scale to meet that demand.
WHAT YOU'LL DO AT H1
As a Software Engineer on the Data Engineering team, you will be key in analyzing vast amounts of data and providing user support within an AWS cloud environment. You’ll be responsible for writing production-grade pipelines using big data technologies and data wrangling to support our internal product. You’ll manage projects across all stages, including application deployment, to deliver the best scalable, stable, and high-quality healthcare data application in the market.
You will:
- Be responsible for product features related to data transformations, enrichment, and analytics.
- Work closely with internal stakeholders, gathering requirements and delivering solutions while effectively communicating progress and tracking tasks to meet project timelines.
- Act as a subject matter expert for Real World Evidence (RWE) data (claims, publications, payments), and represent the data commercially with customers, in collaboration with the product team, and in presentations to the ELC.
- Work within end-to-end delivery of data to produce and shape the direction of RWE data at H1.
- Help steer the technical strategy and architecture, ensuring the smooth development, deployment, and scalability of applications across the entire technology stack.
- Collaborate closely with our Insights/AI team to build knowledge into our data and AI/ML platforms.
- Work cross-functionally across the engineering, data, and product organizations to support your team in delivering the best healthcare data application in the market.
ABOUT YOU
You possess robust hands-on technical expertise encompassing both conventional and non-conventional ETL methodologies, alongside proficiency in T-SQL and Spark-SQL. Your skill set includes mastery of multiple programming languages such as Python (PySpark), Java, or Scala, as well as adeptness in streaming and other advanced data processing techniques. As a self-starter, you excel in managing projects across all stages, from requirement gathering and design to coding, testing, implementation, and ongoing support. Your proactive approach and diverse skill set make you an invaluable asset in driving innovation and delivering impactful solutions within our dynamic data engineering team.
REQUIREMENTS
- 3+ years of experience working with strong big data engineering teams and deploying products on AWS
- Strong coding skills in Python (PySpark), Java, Scala, or another language of your choice, and in stacks supporting large-scale data processing
- Experience with Docker, Kubernetes, or Terraform
- Experience with databases like PostgreSQL
- Experience with software management tools such as Git, JIRA, and CircleCI
- Strong grasp of computer science fundamentals: data structures, algorithmic trade-offs, etc.
- Experience with data processing technologies like Spark Streaming, Kafka Streaming, K-SQL, Spark SQL, or Map/Reduce
- Understanding of various distributed file formats, such as Apache Avro and Apache Parquet, and of common methods in data transformation
- Experience in performing root cause analysis on internal and external data and processes to answer specific business questions and find opportunities for improvement
- Willingness to manage projects through all the stages (requirements, design, coding, testing, implementation, and support)
- Ability to write clean, modular data processing code that is easy to maintain
Not meeting all the requirements but still feel like you’d be a great fit? Tell us how you can contribute to our team in a cover letter!
H1 OFFERS
- Full suite of health insurance options, in addition to generous paid time off
- Pre-planned company-wide wellness holidays
- Retirement options
- Health & charitable donation stipends
- Impactful Business Resource Groups
- Flexible work hours & the opportunity to work from anywhere
- The opportunity to work with leading biotech and life sciences companies in an innovative industry with a mission to improve healthcare around the globe
Tags: Architecture, Avro, AWS, Big Data, Computer Science, Data pipelines, Docker, Engineering, ETL, Excel, Git, Java, Jira, Kafka, Kubernetes, Machine Learning, Parquet, Pipelines, PostgreSQL, PySpark, Python, Research, Scala, Spark, SQL, Streaming, Terraform, Testing, T-SQL
Perks/benefits: Career development, Flex hours, Flex vacation