Senior Data Engineer
Waltham, Massachusetts, United States
Applications have closed
Statistics & Data Corporation (SDC)
The Senior Data Engineer will design and automate the cleaning, processing, and analysis of both clinical and nonclinical data, with the goal of transforming data into information that supports a better understanding of the safety and efficacy of new clinical therapies. The role contributes to the organization’s strong drive to be at the forefront of using Artificial Intelligence (AI) in clinical trials to simplify data processing and discover imperceptible correlations. A fully remote opportunity is available if desired.
This is a full-time, non-contract role. Recruiters and agencies: please do not submit any candidates for consideration.
Primary Responsibilities
- Designs and develops data architecture for new and existing applications and data sources
- Establishes Data Quality, Data Governance and Master Data Management Best Practices to enable the business to maintain clean and accurate data
- Designs ETL frameworks and features to allow for robust and scalable data pipelines
- Incorporates new pipelines into the existing data model, augmenting as needed
- Manages and maintains several production systems
- Engages with various internal cross-functional departments to strategically design, develop and implement data pipelines while understanding the underlying data
- Increases team productivity by developing, identifying, and implementing better tools and processes
- Exemplifies good documentation, coding, and testing best practices
- Develops standard operating procedures for the use of artificial intelligence and data engineering principles within clinical trials
- Prototypes new ideas/technologies to create proof of concept and demos
- Provides mentorship for other data engineers
- Assists in executing the responsibilities of Data Engineers, including the following:
- Maintains and develops ETL pipelines
- Maintains an Enterprise Data Warehouse by updating translation logic and supporting warehouse servers
- Follows Data Quality, Data Governance and Master Data Management Best Practices
- Delivers high quality software design documentation
- Prototypes new ideas/technologies to create proof of concept and demos
- Contributes to the development of standard operating procedures
- Performs other related duties incidental to the work described herein
- Adheres to all essential systems and processes that are required at SDC to maintain compliance with business and regulatory requirements
- Acts as a resource for other team members for debugging, code reviews and other software development lifecycle activities
- Contract Research Organization experience and familiarity with its operations
- The above statements describe the general nature and level of work being performed by individuals assigned to this classification. This document is not intended to be an exhaustive list of all responsibilities and duties required of personnel so classified.
Requirements
- Fluency in Python with experience parsing, manipulating and converting data to and from a wide range of formats (CSV, JSON, XML, HTML, SQL tables, etc.)
- Deep understanding of modern RDBMS concepts (triggers, indexes, views, stored procedures) and SQL syntax, including experience with at least one modern RDBMS
- Ability to design an efficient data warehouse following dimensional modeling principles for scalable reporting
- Solid understanding of multiple database systems (NoSQL, SQL)
- System design capabilities to improve operational efficiency and costs.
- Experience in the software development lifecycle.
- Ability to mentor other data engineers and increase team productivity
- Ability to work with stakeholders to translate business requirements into clear technical specifications
- Ability to communicate effectively in writing and verbally.
- Ability to identify issues, present problems, and implement solutions
- Ability to communicate technical concepts clearly, concisely, and understandably to non-technical colleagues
- Good leadership, organizational, and time management skills, with the ability to multi-task
- Strong interpersonal communication and presentation skills
Education or Equivalent Experience
- Bachelor’s degree in a technical field with 7 years of technical experience, with the last ~4 years being in a Data Engineering-centric role
- OR
- 10+ years in a technical role, with the last ~5 years being in a Data Engineering-centric role