Junior Data Engineer
Washington, District of Columbia, United States - Remote
Applications have closed
Sayari
Get instant access to public records, financial intelligence and structured business information on over 455 million companies worldwide.
Sayari is looking for Junior and Mid-Level Data Engineers to join our growing team! As a member of Sayari's data team you will work with our Product and Software Engineering teams to collect data from around the globe, maintain existing ETL pipelines, and develop new pipelines that power Sayari Graph.
About Sayari:
Sayari is a venture-backed and founder-led global corporate data provider and commercial intelligence platform, serving financial institutions, legal & advisory service providers, multinationals, journalists, and governments. We are building world-class SaaS products that help our clients glean insights from vast datasets that we collect, extract, enrich, match and analyze using a highly scalable data pipeline. From financial intelligence to anti-counterfeiting, and from free trade zones to war zones, Sayari powers cross-border and cross-lingual insight into customers, counterparties, and competitors. Thousands of analysts and investigators in over 30 countries rely on our products to safely conduct cross-border trade, research front-page news stories, confidently enter new markets, and prevent financial crimes such as corruption and money laundering.
Our company culture is defined by a dedication to our mission of using open data to prevent illicit commercial and financial activity, a passion for finding novel approaches to complex problems, and an understanding that diverse perspectives create optimal outcomes. We embrace cross-team collaboration, encourage training and learning opportunities, and reward initiative and innovation. If you enjoy working with supportive, high-performing, and curious teams, Sayari is the place for you.
POSITION DESCRIPTION
Sayari’s flagship product, Sayari Graph, provides instant access to structured business information from hundreds of millions of corporate, legal, and trade records. As a member of Sayari's data team you will work with our Product and Software Engineering teams to collect data from around the globe, maintain existing ETL pipelines, and develop new pipelines that power Sayari Graph.
Requirements
What You Will Need:
- Professional experience with Python and a JVM language (e.g., Scala)
- 2+ years of experience designing and maintaining ETL pipelines
- Experience using Apache Spark and Apache Airflow
- Experience with SQL (e.g., Postgres) and NoSQL (e.g., Cassandra, TigerGraph) databases
- Experience working on a cloud platform like GCP, AWS, or Azure
- Experience working collaboratively with git
What We Would Like:
- Understanding of Docker/Kubernetes
- Understanding of or interest in knowledge graphs
Who You Are:
- Experienced in supporting and working with cross-functional teams in a dynamic environment
- Interested in learning from and mentoring team members
- Passionate about open source development and innovative technology
Benefits
- A collaborative and positive culture - your team will be as smart and driven as you
- Limitless growth and learning opportunities
- A strong commitment to diversity, equity, and inclusion
- Performance and incentive bonuses
- Competitive compensation and comprehensive, family-friendly benefits, including full healthcare coverage plans, commuter benefits, 401(k) matching, generous vacation, and parental leave
- Conference & Continuing Education Coverage
- Team building events & opportunities
Sayari is an equal opportunity employer and strongly encourages diverse candidates to apply. We believe diversity and inclusion mean our team members should reflect the diversity of the United States. No employee or applicant will face discrimination or harassment based on race, color, ethnicity, religion, age, gender, gender identity or expression, sexual orientation, disability status, veteran status, genetics, or political affiliation. We strongly encourage applicants of all backgrounds to apply.
Tags: Airflow AWS Azure Cassandra Docker Engineering ETL GCP Git Kubernetes NoSQL Open Source Pipelines PostgreSQL Python Research Scala Spark SQL
Perks/benefits: 401(k) matching Career development Competitive pay Equity Parental leave Startup environment Team events