Data Engineer - AWS, Python, Pipelines

Dublin, County Dublin, Ireland

TetraScience

The Tetra Scientific Data and AI Cloud is the only vendor-neutral, open, cloud-native platform purpose-built for science. Get next-generation lab data automation, scientific data management, and foundational building blocks of Scientific AI....

TetraScience provides the world’s first and only R&D Data Cloud, with a mission to transform life sciences R&D, accelerate discovery, and improve human life. Scientists at global pharma and biotech organizations rely on our innovative Tetra Data Platform for easy access to centralized, harmonized, and actionable scientific data to accelerate their digital lab transformation. With best-in-class SaaS performance, a team of industry innovators, and excellent product/market fit, Tetra is positioned to become an iconic life sciences software company.

Who We Are

You thrive on working well with others. You make the people around you better. You love to collaborate with fellow team members, customers, field engineers, and executives, and to inspire them to do their best.

You relentlessly strive to excel in your craft. You are passionate about building, observing, and operating distributed systems at scale in production. You understand the challenges and trade-offs involved in building and deploying new systems to production, and you are willing to push the boundaries of scale.

You consistently seek understanding and clarity. You look at every interaction as an opportunity to learn. You aren’t afraid to ask questions. You have the humility and confidence to not be the smartest person in the room.

    What You Will Do

    • Own, prototype, and implement customer solutions
    • Research and prototype data acquisition strategy for scientific lab instrumentation
    • Research and prototype file parsers for instrument output files (.xlsx, .pdf, .txt, .raw, .fid, many other vendor binaries)
    • Design and build data models
    • Design and build Python data pipelines, unit tests, integration tests, and utility functions
    • Build visualizations, reports, and dashboards using Spotfire, Tableau, Jupyter notebooks, etc.
    • Work with the customer to test the solution and confirm that it fulfills their requirements and meets their needs
    • Coordinate project kickoff meetings; manage the customer relationship throughout the project, and conduct formal project closeout meetings
    • Facilitate internal project post-mortems to identify areas for improvement in the next implementation

    Requirements

    • More than 5 years of experience with Python and SQL
    • Passionate about science and building solutions to make the data more accessible to the end-users
    • Elasticsearch, science background, or experience with scientific instruments
    • Experience with tools like Spotfire, Tableau, Jupyter notebook (any of them)
    • Undergraduate or graduate degree in chemistry, biology, computer science, statistics, public health, etc.
    • Excellent communication skills, attention to detail, and the confidence to take control of project delivery
    • Ability to quickly understand a highly technical product and communicate effectively with product management and engineering
    • Strong project, account management, and proactive problem-solving skills
    • High-bandwidth: thrives when managing multiple simultaneous projects
    • Intellectually curious: Unwavering drive to learn and know more every day
    • Ability to think creatively about how to resolve project risks without reducing quality
    • Team player with the ability to "roll up your sleeves" and do what it takes to make the team successful

