Data Engineering Internship
San Jose, CA
Applications have closed
Vectra
Vectra AI's Threat Detection and Response Platform protects your business from cyberattacks by detecting attackers in real time and taking immediate action. Vectra® is the leader in AI-driven threat detection and response for hybrid and multi-cloud enterprises.
The Vectra Platform captures packets and logs across network, public cloud, SaaS, and identity, and applies patented security-led AI to surface and prioritize threats for rapid response. Vectra's threat detections are powered by a deep understanding of attacker methods and problem-optimized AI algorithms. Alerts uncover attacker methods in action and are correlated across customer environments to expose real attacks. Organizations around the world rely on Vectra to see and stop threats before a breach occurs. For more information, visit www.vectra.ai.
Position Overview
Detecting attackers in real time requires robust data pipelines that enable machine learning and statistical techniques. As an intern on the Data Engineering team, you will help transform rich network traffic data and cloud log data into meaningful features, and develop data systems for collecting algorithm telemetry. You will build pipelines and tools for both on-prem and cloud deployments, collaborating with Data Scientists and Software Engineers in the process.
Responsibilities
- Work with the Data Engineers on the team to improve existing features and develop new ones that let Data Scientists access data in ways previously unavailable
- Possible projects include:
  - Building a converter that writes data to Parquet format and catalogs it using AWS Glue
  - Performing ETL on existing data to restructure time series data into a more accessible format
  - Automating the piping of network captures into a process that converts them into metadata and loads the result into Spark
Qualifications
- Required
  - Working towards a BS or MS in Computer Science or a related field
  - Strong programming skills with experience in Python, C++, or Java
  - Linux proficiency and shell scripting
- Desirable
  - Experience with Docker, Kubernetes, or another container orchestration tool
  - Experience working with AWS or GCP offerings
  - Experience with a source control system, preferably Git
  - Familiarity with Hadoop, MapReduce, Spark, and distributed computing
  - Understanding of data pipeline architectures (e.g., Lambda, Kappa)
  - Hands-on database experience (MySQL, MongoDB, CouchDB, Elasticsearch, etc.)
  - Knowledge of real-time data pipelines (e.g., Kafka and Spark Streaming)
  - Experience with continuous integration and deployment workflows
A two-minute video that describes what we do at Vectra, and an article about Vectra's last funding round:
https://vimeo.com/89579264
https://tcrn.ch/3gVAXNw