Principal Cloud Data Engineer
Remote - USA
Do you want to make a meaningful contribution to society by securing critical infrastructure, knowing that your work is impactful? Our software codifies our knowledge and experience, delivering an intelligent, orchestrated, and automated approach to asset protection, threat detection, analysis, and response. The Dragos Neighborhood Keeper Program is not just a technology solution; it is a partnership with industry leaders to close gaps in the community’s ability to respond to OT threats. Dragos is looking for a self-motivated and enthusiastic Principal Cloud Data Engineer interested in developing solutions that help safeguard the world’s industrial infrastructure as part of a highly collaborative team. We set our goal as best in class and are looking for team players who hold themselves to the same standard.
Responsibilities
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Support ingest of network data into the Elastic Stack (Elasticsearch, Logstash, Kibana) and Apache Airflow
- Design and develop highly scalable code to support analytics used to detect cyber threat activity
- Write well-designed, testable, and efficient code
- Contribute to all phases of the development lifecycle (Agile/Scrum)
- Prepare and produce releases of software components (Atlassian stack)
- Support continuous improvement by investigating and presenting alternative technologies for team review
- Refactor and improve existing code for performance and simplicity
- Write automated unit tests that will ensure the integrity of our software
Requirements
- 3-5 years’ experience building and optimizing ‘big data’ data pipelines, architectures, and data sets.
- Strong analytic skills related to working with unstructured datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Intellectual curiosity to find new and unusual ways to solve data management issues.
- Strong knowledge of building and interacting with REST APIs.
- Familiar with serverless system architectures and Infrastructure as Code design patterns
- Experience with Elasticsearch (Index Configuration, Sharding, Partitioning, Aliases, Performance Tuning Clusters) or similar technology stack is highly desired
- Knowledge of software development principles and agile methodology
- Experience with Elastic Stack, including Elasticsearch, Logstash, Kibana, Apache Airflow, RabbitMQ, or related technologies
- Must be a US Citizen or Permanent Resident
- Must be able to pass a background check
Nice to Have
- Experience with HashiCorp automation and security tools, specifically Vault and Terraform
- Understanding and experience with data science algorithms and application of machine learning to drive correlation and clustering