Data Engineer
Pune, India
IntegriChain
Company Description
IntegriChain is the data and application backbone for market access departments of Life Sciences manufacturers. We deliver the data, the applications, and the business process infrastructure for patient access and therapy commercialization. More than 250 manufacturers rely on our ICyte Platform to orchestrate their commercial and government payer contracting, patient services, and distribution channels. ICyte is the first and only platform that unites the financial, operational, and commercial data sets required to support therapy access in the era of specialty and precision medicine. With ICyte, Life Sciences innovators can digitalize labor-intensive market access processes, freeing up their best talent to focus on data-driven decision support, identify and resolve coverage and availability hurdles, and manage pricing and forecasting complexity.
We are headquartered in Philadelphia, PA (USA), with offices in: Ambler, PA (USA); Pune, India; and Medellín, Colombia. For more information, visit www.integrichain.com, or follow us on Twitter @IntegriChain and LinkedIn.
We are excited to offer a hybrid working environment for our employees. To be successful in this role, you will need to visit our office locations anywhere from once to a few times a quarter for workshops, team meetings, and cross-team collaboration.
Job Description
- Develop, support, and refine new data pipelines, data models, business logic, data schemas as code, and analytics to product specifications.
- Prototype and optimize data type checks to ensure data uniformity prior to load.
- Develop and refine both streaming and batch processing data pipeline frameworks.
- Maintain, improve, and develop expertise in existing production data, models, and algorithms.
- Learn and utilize business data domain knowledge and its correlation to underlying data sources.
- Define, document, and maintain a data dictionary, including data definitions, data sources, and the business meaning and usage of information.
- Identify and validate opportunities to reuse existing data and algorithms.
- Work with stakeholders to gather requirements for merging, de-duplicating, and standardizing data.
- Collaborate on design and implementation of data standardization procedures.
- Share team responsibilities, such as contributing to the development of data warehouses and productizing algorithms created by Data Science team members.
Qualifications
- 4-6 years of experience building data pipelines and using ETL tools (Must-have).
- 2+ years of experience with ETL tools such as Talend or Jaspersoft ETL (Must-have).
- 2+ years of experience with SQL (Must-have).
- Strong in writing stored procedures and SQL queries (Must-have).
- 2+ years of experience in Python programming (Must-have).
- Exposure to T-SQL or PL/SQL development, including concepts such as indexes, views, triggers, recursive CTEs, pivot/unpivot, and writing complex queries.
- Knowledge of any tool for scheduling and orchestrating data pipelines or workflows, preferably Airflow (Nice to have).
- 1+ years of experience developing modern, industry-standard big data frameworks with AWS or other cloud services (Must-have).
- Experience with common GitHub developer practices and paradigms.
- Experience working with agile methodologies and cross-functional teams.
- Knowledge of building AWS data pipelines using Python and an S3 data lake (Nice to have).
- Knowledge of Redshift or any other columnar database is preferred.
- Experience with AWS services including S3, Redshift, and EMR (Nice to have).
- Knowledge of distributed systems as they pertain to data storage and computing.
- Knowledge of specialty pharmaceutical and retail pharmacy is a plus.
- Knowledge of the data integration process is good to have.
- Ability to effectively communicate with both business and technical teams.
Additional Information
What does IntegriChain have to offer?
- Mission driven: Work with the purpose of helping to improve patients' lives!
- Excellent and affordable medical benefits + non-medical perks including Flexible Paid Time Off
- Robust Learning & Development opportunities, including more than 700 development courses free to all employees
IntegriChain is committed to equal treatment and opportunity in all aspects of recruitment, selection, and employment without regard to race, color, religion, national origin, ethnicity, age, sex, marital status, physical or mental disability, gender identity, sexual orientation, veteran or military status, or any other category protected under the law. IntegriChain is an equal opportunity employer; committed to creating a community of inclusion, and an environment free from discrimination, harassment, and retaliation.