Data Acquisition Engineer - NLP

New York / Remote

Applications have closed

CertiK

CertiK is the leading security-focused ranking platform to analyze and monitor blockchain protocols and DeFi projects.


About the Company

Founded in 2018 by professors of Yale University and Columbia University, CertiK is a pioneer in blockchain security, utilizing best-in-class AI technology to secure and monitor blockchain protocols and smart contracts. CertiK’s mission is to secure the cyber world. Starting with blockchain, CertiK applies cutting-edge innovations from academia into enterprise, enabling mission-critical applications to be built with security and correctness.
CertiK is one of the fastest-growing and most trusted companies in blockchain security and has become a true market leader. To date, we have worked with over 1,800 enterprise clients, helped secure over $310 billion worth of digital assets, and detected over 31,000 vulnerabilities in blockchain code. Our clients include leading projects such as Aave, Polygon, Binance Smart Chain, Terra, Yearn, and Chiliz.
CertiK just raised over $140 million and is backed by Coatue, Tiger Global, Sequoia, and Hillhouse Capital.
About the Role

CertiK is looking for a Data Acquisition Engineer to help build and maintain a data ingestion and organization system for our team of data scientists, focusing on social media and language data from platforms such as Twitter, Reddit, Instagram, and Telegram.
About You

You are interested in helping us build a world-class data ingestion and analytical system, focusing on social media and other language data. You are creative, have great attention to detail, and are obsessed with optimizing and improving computational processes. You are up to the challenge of helping build a data infrastructure from the ground up and making a huge impact from day one.

Responsibilities

  • Help design and develop data pipelines
  • Identify new data sources and integrate them into our data ecosystem
  • Maintain the collection and processing of data from a variety of sources, specifically social media data
  • Work with data scientists to establish project feasibility, requirements, and other data analysis tasks
  • Monitor and maintain data quality and propose ideas to speed up and improve team processes
  • Work with data scientists to implement data cleaning and pre-processing algorithms

Requirements

  • B.S. degree in Computer Science, Statistics, Data Science, or a related field, or equivalent experience
  • Expertise in data warehouses such as Snowflake or big data query engines such as Presto / Spark
  • Strong familiarity with data APIs, web scraping, and pre-processing of raw unstructured data
  • Ability to work well with others and communicate problems and findings clearly
  • Experience with data DevOps tools such as Airflow, Amundsen, Kafka, or others
  • Solid computer science fundamentals, such as data structures, algorithms, testing, CI/CD, Git, and shell scripting
  • Prior experience in a fast-paced, growing start-up environment is a plus


CertiK is proud to offer medical, vision, and dental insurance, a 401(k) plan with company matching, life and accidental death and dismemberment insurance, an HSA (with high-deductible plan), an FSA, and other benefits to all full-time employees, along with flexible paid time off and holidays.
In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.
CertiK is proud to be an equal opportunity employer. We will not discriminate against any applicant or employee on the basis of age, race, color, creed, religion, sex, sexual orientation, gender, gender identity or expression, medical condition, national origin, ancestry, citizenship, marital status or civil partnership/union status, physical or mental disability, pregnancy, childbirth, genetic information, military and veteran status, or any other basis prohibited by applicable federal, state or local law.
CertiK will consider for employment qualified applicants with criminal histories in a manner consistent with local and federal requirements.
https://www.eeoc.gov/sites/default/files/migrated_files/employers/poster_screen_reader_optimized.pdf
All CertiK employees are expected to actively support diversity on their teams and in the Company.

Tags: Airflow APIs Big Data Blockchain CI/CD Computer Science Data analysis Data pipelines DevOps Git Kafka NLP Pipelines Security Snowflake Spark Statistics Testing Unstructured data

Perks/benefits: Flex vacation Health care Insurance Startup environment

Regions: Remote/Anywhere North America
Country: United States
