Senior Data Architect

Remote - United States

TetraScience

The Tetra Scientific Data and AI Cloud is the only vendor-neutral, open, cloud-native platform purpose-built for science. Get next-generation lab data automation, scientific data management, and foundational building blocks of Scientific AI....

Who We Are

TetraScience is the R&D Data Cloud company, solving humanity's grand challenges by accelerating and improving scientific outcomes. The Tetra R&D Data Cloud provides life sciences companies with the flexibility, scalability, and data-centric capabilities to enable easy access to compliant, engineered, liquid, and actionable scientific data ('Tetra Data'). As an open platform, TetraScience has built the largest network of life sciences innovators, including instrument makers, informatics solution providers, CRO/CDMOs, visualization, analytics, and data science partners — creating seamless interoperability and empowering an innovation feedback loop to drive the future of life sciences and harness the power of the world's scientific data.

Our core values are designed to guide our behaviors, actions, and decisions so that we operate as one. We are looking to add individuals to our team who demonstrate the following values:

  • Transparency and Context: We trust our people will make the right decisions and overcome any challenges when given data and context.
  • Trust and Collaboration: We believe there can only be trust when there is transparency. We are committed to always communicating openly and honestly.
  • Fearlessness and Resilience: We proactively run toward challenges of all types. We embrace uncertainty and we take calculated risks.
  • Alignment with Customers: We are completely committed to ensuring our customers and partners achieve their missions, and we treat them with respect and humility.
  • Commitment to Craft: We are passionate missionaries. We sweat the details, as the small things enable the big things.
  • Equality of Opportunity: We seek out the best of the best regardless of gender, ethnicity, race, or age. We seek out those who embody our common values but bring unique and invaluable perspectives, talents, and advantages.

Who You Are

You thrive on working well with others. You make the people around you better. You love to collaborate with fellow team members, customers, field engineers, and executives, and to inspire them to do their best.

You relentlessly strive to excel in your craft. You are passionate about building, observing, and operating distributed systems at scale in production. You understand the challenges and trade-offs involved in building and deploying new systems to production, and you are willing to push the boundaries of scale.

You consistently seek understanding and clarity. You look at every interaction as an opportunity to learn. You aren’t afraid to ask questions. You have the humility and confidence to not be the smartest person in the room.

What You Will Do

Data ingested and harmonized in this ecosystem is accessible through the Elasticsearch REST API and SQL. Data harmonization involves transforming primary data into a vendor-neutral JSON format that is validated against schemas developed by our team of scientific data experts. As Principal / Lead Data Architect, you will own the evolution and development of the TetraScience Scientific Data Model Schemas. Your responsibilities will include:

  • Developing and maintaining opinionated attribute names for scientific research data from a variety of instruments and domains
  • Serving as a thought leader on data schematization and/or ontology for cross-vendor data produced by similar instruments
  • Documenting best practices for data schema development and version control
  • Collaborating with the product team and end users to design data structures optimized for downstream use cases, including sending data to lab informatics systems, querying data via API, and running analytics in BI tools

Requirements

What You Have Done

  • 10+ years in scientific data, ideally in the instrumentation and/or lab informatics space
  • Proven experience in data harmonization and architecture in SQL, NoSQL, and/or a lab informatics environment
  • Experience addressing and developing processes around challenges such as:
    • Term consistency across diverse data sets (for example, metadata or ontology management)
    • Designing data schema architecture that balances iteration and stability
  • Experienced data consumer, capable of considering the implications of database design on usage in analytics, search, and ML (e.g. LIMS / ELN or other lab informatics software; Jupyter Notebooks; Dashboard development; BI tools like Tableau & Spotfire)
  • Strong track record of working with Python and SQL
  • Exceptional written and verbal communication skills
  • Experience working with biopharma data and techniques, such as liquid chromatography and mass spectrometry, or a proven track record of quickly learning the nuances of complex domains

Benefits

  • 100% employer-paid benefits for all eligible employees and immediate family members
  • Unlimited paid time off (PTO)
  • 401(k)
  • Flexible working arrangements: remote work, plus office as needed
  • Company-paid life insurance and LTD/STD


No visa sponsorship is available for this position

