Data Engineer II
New York City, United States
Paige
Paige is transforming the way pathologists work through a comprehensive digital pathology platform and generative AI applications. Paige is a technology company helping pathologists, oncologists and clinicians make faster, more informed diagnostic and treatment decisions in cancer through the use of advanced Artificial Intelligence (AI). We are uniquely positioned to do this by mining decades of data from the world’s experts in cancer care, and we are now leading the digital transformation of cancer pathology and diagnostics.
Paige is the first company to develop clinical-grade AI tools for the pathologist, which resulted in our receiving the first FDA approval for an AI product in pathology. Paige has also received FDA clearance for our digital viewer, FullFocus™, and recently, CE-IVD clearance in Europe and the UK for our diagnostic AI in prostate and breast cancer. We have also established multiple relationships with biopharma, laboratory and equipment manufacturers that enable Paige to develop a full solution for busy laboratories, ready to help patients receive better diagnoses and treatment.
We are seeking a Data Engineer II to work on exciting projects as part of a joint partnership between Paige and Memorial Sloan Kettering (MSK). While working as an employee of MSK, you will be an integral member of their Digital Pathology team focused on Paige-funded projects, where you will meet the challenge of converting pathology from an analog to a digital practice. Your work will include creating pipelines to consolidate and deliver vital data that will drive research and analytics focused on improving patient care.
Key Responsibilities
- Engineer infrastructure and tools that automate delivering critical data to downstream systems and industry partners
- Design and build backend integrations using various technologies, taking a creative approach when working with legacy systems
- Engineer features of the data platform that will help ensure quality and robustness
- Collaborate in an agile team with Product Owners, Scrum Masters, System Architects, other Development Teams and Users
- Participate in the full SAFe and Agile development life cycle, including analysis, design, build and release of data pipelines
About You
- Proficient in relational database schema and query design (i.e., SQL)
- Proficient in using scripting languages such as Python and bash
- Proficient with industry standard tools for ETL design and automation (e.g., DataStage, Airflow, Prefect)
- Familiar with Linux environments and server administration
- Experience using Docker, Kubernetes or similar container technologies and setting up CI/CD pipelines is a plus
- Experience designing RESTful APIs is a plus
- Experience and familiarity with Cloud providers (AWS, Azure) is a plus
- You are a highly skilled data engineer who is comfortable working with large volumes of data
- You can design and develop complex solutions using modern practices and technologies
- You consider yourself an eager, self-starting learner who can quickly pick up new technologies
Tags: Agile Airflow APIs AWS Azure CI/CD Data pipelines Docker ETL Kubernetes Linux Pipelines Python Research Scrum SQL