Software Systems Engineer, Data Pipelines
Remote
Fyusion, Inc.
Here's the day to day:
- Automate, patch and otherwise develop data processing systems written in Python/C++ and orchestrated using Docker/Kubernetes
- Maintain data storage and retrieval systems, while consistently responding to needs for enhanced sorting, searching, and transformation
- Work with embedding spaces, keeping vector search operations fast and insightful
- Ensure the integrity and security of all proprietary data using industry best practices and up-to-date standards
- Track progress in model training pipelines and ensure that benchmarks are being met and exceeded
- Maximize productivity relative to cost for compute resources, both cloud and on-prem
- Troubleshoot ML inference by identifying training bottlenecks and outliers in training data
- Present analysis and insights with clear descriptions and visual aids as needed
Here's what we're looking for:
- 5+ years of experience in data engineering, data pipelines, ML Ops, or a similar field
- Proficiency in Python, working knowledge (or better) of C++
- Solid programming fundamentals: programming paradigms, design patterns, data structures and algorithms, complexity analysis, concurrency
- Experience with data indexing and retrieval toolsets (example: ELK stack). You can explain in great detail how these tools work under the hood and the tradeoffs associated with their use
- Experience using containers and with container orchestration (Docker/Kubernetes preferred)
- Knowledge of ML model training techniques, tradeoffs, and pitfalls
- Knowledge of vector search: similarity metrics, encoding embeddings, ANN search algorithms
- Software development best practices: version control (Git), designing and maintaining automated tests, giving and receiving code review, taking an active role in regular release cycles
Bonus points for:
- Math fundamentals: statistics, linear algebra, calculus
- Experience with data science tools: SciPy, R, MATLAB, or similar
- Open source contributions to a relevant tool
Perks/benefits: Career development Competitive pay Health care Salary bonus Unlimited paid time off