Graph Data Engineer
Austin, Texas, United States
Olive
Olive is purpose-built for healthcare, improving operational efficiency for provider and payer teams with intelligent automation. Olive’s AI workforce is built to fix our broken healthcare system by addressing healthcare’s most burdensome issues, delivering increased revenue, reduced costs, and increased capacity to hospitals and health systems. People feel lost in the system today, and healthcare employees are essentially working in the dark because outdated technology creates siloed data and a lack of shared knowledge. Olive is designed to drive connections, shining a new light on the broken healthcare processes that stand between providers and patient care. She uses AI to reveal life-changing insights that make healthcare more efficient, affordable, and effective. Olive’s vision is to unleash a trillion dollars of hidden potential within healthcare by connecting its disconnected systems. Olive is improving healthcare operations today, so everyone can benefit from a healthier industry tomorrow.
Olive is searching for experienced engineers to provide technical leadership and guidance as we build our technology platform. Data Engineers within Olive Graph work with the Olive Product Management team to deliver value in the Olive Graph. We encourage a growth mindset among all of our engineers and value those with the drive to continuously expand their industry knowledge. A successful Data Engineer will possess strong analytical and technical skills and be able to communicate the logic behind technical decisions to non-technical stakeholders.
Responsibilities (including but not limited to):
- Create data transformation pipelines with Airflow, GitLab, and various graph data toolkits to convert numerous heterogeneous data sources into entries in the Olive Knowledge Graph
- Work alongside the Ontology Engineers, Platform Engineers, and Product to ensure high quality data
- Analyze data pipelines and make the necessary changes to optimize performance.
- Diagnose and resolve issues promptly and in accordance with maintainability goals.
- Work with a variety of technical and non-technical people.
- Embrace changing requirements.
- Create and maintain efficient, reliable infrastructure with code
- Drive automation using popular cloud orchestration, configuration management, and CI/CD systems
- Design and implement:
  - Solutions to consume from sources like data lakes, RDBMS, and NoSQL data layers
  - Data quality check frameworks
  - Alerting and monitoring for the overall data stack
  - Scalable data pipelines
- Work with languages such as: SQL, SPARQL, Python, Java, Bash
Requirements
- Bachelor’s degree in Computer Science, Mathematics, Statistics, or Physics, or equivalent relevant experience
- 5+ years of Data Engineering or data warehousing experience
- A strong understanding of operating systems, networking, and software engineering fundamentals
- Experience using AWS or other virtualized infrastructure
- Experience managing a container-based microservice architecture, including orchestration, service discovery, monitoring, and debugging
- Proficiency in a scripting language (e.g. Bash, Python, Ruby, Perl, PowerShell)
- Experience orchestrating infrastructure using CloudFormation, Terraform, or other similar tooling.
- Experience building on Linux and Windows systems (e.g. Amazon Linux 2, Ubuntu, CentOS, Container Linux)
- Strong experience with SQL and NoSQL databases (e.g. MySQL, PostgreSQL, Oracle, MongoDB, SQL Server)
Ideal Experience:
- Experience with big data solutions like Spark, Hadoop, or Hive
- Experience with streaming infrastructure like Kafka, Kinesis or Apache Beam
- Experience with Data Lakes (Lake Formation/Snowflake) and Lake querying technologies (e.g. Athena, Redshift)
- Experience deploying or managing infrastructure across AWS Availability Zones and regions
- Experience with semantic web technologies (e.g. RDF, SPARQL, OWL)
- Knowledge of RDF engines such as Apache Jena Fuseki, Stardog, or Amazon Neptune