Principal Data Engineer
Cambridge, MA USA
Applications have closed
Flagship Pioneering, Inc.
We are Flagship Pioneering. We are a biotechnology company that invents platforms and builds companies that change the world.
What if… you could tell the story of an organization that conceives, creates, resources, and develops first-in-category bioplatform companies to transform human health and sustainability?
Since its launch in 2000, Flagship Pioneering has, through its Flagship Pioneering Labs unit, applied its unique hypothesis-driven innovation process to originate and foster more than 100 scientific ventures, resulting in over $130 billion in aggregate value. To date, Flagship has deployed over $2.5 billion in capital toward the founding and growth of its pioneering companies, alongside more than $19 billion of follow-on investments from other institutions.
Flagship Pioneering presents compelling opportunities for impact-focused technology professionals:
- The Flagship Labs ecosystem is composed of new and growing companies with widely varying needs that span all aspects of technology, presenting incredible opportunities for learning and development.
- Flagship creates new companies that operate without the burden of legacy systems or technical debt. Individuals who understand the technical opportunities and challenges of company growth will have the opportunity to positively influence the trajectory of dozens of companies.
- Flagship believes that the sophisticated use of modern digital and data technologies is a strategic differentiator for companies, with the potential to significantly alter their research, productivity, market position and long-term success. To deliver on this mission, Flagship seeks individuals who have a proven track record of solving enterprise-scale challenges while thriving in a dynamic startup culture.
Position Summary
The Principal Data Engineer will report to the Senior Director, Cloud Architecture, and be part of the Engineering Data & Systems team, which supports Scientific Systems, Data, and Data Science needs for both Flagship Pioneering and Enterprise companies. This role is responsible for providing cloud-based data solutions, best practices, and guidance to Flagship Pioneering and enterprise companies, and will be a key member in designing and building our Digital Backbone.
A typical day may involve developing API-centric data flow patterns between systems, authoring reference implementations for AWS scientific computing workloads, automating bioinformatics pipelines to analyze large datasets, creating crawlers for extracting, loading and transforming datasets, and generally making it easier for data scientists to use their cloud environments and tools.
Key Responsibilities:
- Develop high-quality, production-ready code with well-documented and testable APIs
- Work in a collaborative CI/CD software development environment, including using Git, participating in code reviews, and independently developing robust code
- Build and maintain cloud-based infrastructure, blueprints, documentation and training materials
- Design and implement flexible cloud-based solutions of varying levels of sophistication
- Refine methods of data ingestion from external sources (e.g. CROs, data owners) to cloud accounts
- Provide expert guidance, code examples and reference implementations for a variety of AWS workloads
- Support portfolio companies on the pragmatic implementation of FAIR principles
- Collaborate with architects and technical leadership on the design and creation of a digital backbone for our companies
- May lead small teams of consultants to deliver solutions
Qualifications & Experience
- 3+ years of data engineering experience
- Familiarity with data lake and lakehouse architectures
- AWS certifications or equivalent experience
- Expertise with various database technologies, including RDBMS, NoSQL, and graph databases
- Experience modeling data
- Proficiency with Python, shell scripting, IaC, pipeline frameworks (e.g. Nextflow) and DevOps best practices
- A people person with an exceptional ability to collaborate with users of different backgrounds