Staff Data Engineer

Austin, TX or Pittsburgh, PA

Gecko Robotics, Inc.

Who We Are:

Gecko Robotics (Gecko) is focused on protecting and maintaining civilization’s most critical infrastructure with robots and software platforms.

Traditional manual maintenance is dangerous, slow, and reactive. Gecko’s solutions improve safety, are 10x faster, and gather 1,000x more data, allowing infrastructure failures to be predicted before they happen.

Gecko serves customers across the USA and internationally, primarily in the Oil & Gas, Power Generation, and Pulp & Paper markets.

 

Job At A Glance

The Data Processing team develops the software that analyzes, transforms, and visualizes our robotic data so that our customers can minimize their shutdowns and get their assets repaired and back online. Software engineers on the Data Processing team are responsible for the services that make up our data processing pipeline: ingestion, preprocessing, filtering, cleaning, validation, APIs, and more. Our team is currently designing for scale as Gecko continues to push the boundaries of big data.
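
As a rough, hypothetical sketch of what those pipeline stages can look like in Python, the snippet below chains ingestion, cleaning, and validation over toy inspection readings. Every name, field, and threshold is invented for the illustration; it is not Gecko's actual code.

from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Reading:
    """One hypothetical thickness measurement from a robotic scan."""
    x_mm: float
    y_mm: float
    thickness_mm: float


def ingest(rows: Iterable[dict]) -> Iterator[Reading]:
    """Parse raw rows (e.g. pulled from an object store) into typed readings."""
    for row in rows:
        yield Reading(float(row["x"]), float(row["y"]), float(row["thickness"]))


def clean(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Drop samples reported as non-positive (a stand-in for real filtering logic)."""
    return (r for r in readings if r.thickness_mm > 0)


def validate(readings: Iterable[Reading], max_thickness_mm: float = 100.0) -> Iterator[Reading]:
    """Fail fast on physically implausible values before they reach anything downstream."""
    for r in readings:
        if r.thickness_mm > max_thickness_mm:
            raise ValueError(f"implausible thickness reading: {r}")
        yield r


if __name__ == "__main__":
    raw = [
        {"x": "0.0", "y": "0.0", "thickness": "12.7"},
        {"x": "0.0", "y": "1.0", "thickness": "-1"},  # simulated sensor dropout
    ]
    for reading in validate(clean(ingest(raw))):
        print(reading)

Generator-based stages like these keep each step small, composable, and independently testable, loosely mirroring the service-per-stage pipeline described above.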

What You Will Do

  • Design and implement a fast, flexible data platform that maximizes Gecko's ability to deliver its unique value proposition to robotic inspection customers.
  • Interact with engineers and data analysts to design robust data APIs.
  • Lead design and code reviews.
  • Use critical thinking to debug issues and develop solutions to challenging technical problems.
  • Interact with other engineers from multiple disciplines in a team environment.
  • Develop tests to ensure the integrity and availability of the platform.
  • Provide and review technical documentation.

What Gecko Is Looking For

  • 7+ years of experience building cloud-based data pipelines.
  • Bachelor’s and/or Master’s degree in Computer Science, or equivalent experience.
  • Demonstrated ability to write performant, scalable code in Python.
  • Dedication to test-driven development and designing production-ready systems.
  • Dedication to software engineering best practices and the automation tools that support them.
  • Experience with container runtimes (Docker, etc.).
  • Experience with cloud data stores (S3, Google Cloud Storage, etc.).
  • Experience scaling systems horizontally and vertically.
  • Experience with batch-oriented workflows.

Nice-to-haves:

  • NoSQL databases: DynamoDB, Bigtable, Firestore, etc.
  • RDBMS: PostgreSQL, Redshift, Aurora, Google Cloud SQL, etc.
  • Workload orchestration: Kubernetes (K8s), HashiCorp Nomad, etc.
  • Serverless technologies: AWS Lambda/Batch, GCP Cloud Functions/Cloud Run, etc.
  • Distributed workflows: AWS Step Functions, GCP Cloud Workflows/Cloud Composer, Airflow, Spark, etc.
  • PyData stack: pandas, NumPy, Dask, etc.
  • Jupyter notebooks, Google Colab, or equivalent.
