Software Engineer (all levels) - Data, NLP
New York / Remote
Applications have closed
CertiK
CertiK is the leading security-focused ranking platform to analyze and monitor blockchain protocols and DeFi projects. CertiK is one of the fastest-growing and most trusted companies in blockchain security and has become a true market leader. To date, we have collectively worked with over 1,800 enterprise clients, helped secure over $310 billion worth of digital assets, and detected over 31,000 vulnerabilities in blockchain code. Our clients include leading projects such as Aave, Polygon, Binance Smart Chain, Terra, Yearn, and Chiliz.
CertiK just raised over $140 million and is backed by Coatue, Tiger Global, Sequoia, and Hillhouse Capital.
About the Role
CertiK is looking for a Data Engineer familiar with natural language processing to work with Data Scientists analyzing social media data from platforms such as Twitter, Reddit, Instagram, Telegram, and more.
About You
You enjoy playing with and finding insights from data. Specifically, you have experience with or are interested in the process of analyzing human language data using cutting-edge computational techniques.
You participate in or enjoy browsing the most popular social media platforms and are interested in blockchain and its applications such as cryptocurrency, smart contracts, and Web3. You believe it is important to keep this decentralized ecosystem healthy and secure.
As a data engineer, you will work with data scientists and other engineers to acquire social media data, clean and process it, and help analyze it using various NLP and machine learning technologies.
Responsibilities
- Help design and develop data pipelines
- Identify and find new data sources, and integrate them into our data ecosystem
- Maintain the collection and processing of data from a variety of sources, specifically social media data
- Work with data scientists to establish project feasibility, requirements, and other data analysis tasks
- Implement cutting-edge NLP and machine-learning frameworks and libraries
- Assist with the development, deployment, and maintenance of analytical micro-services
- Monitor and maintain data quality and propose ideas to speed up and improve team processes
- Pay attention to the performance and cost of implemented algorithms, and optimize and tune them
Requirements
- B.S. degree in Computer Science, Statistics, Data Science, or a related field, or equivalent experience
- Expertise in data warehouses such as Snowflake or big data query engines such as Presto / Spark, and Python data analysis libraries such as pandas and NumPy
- Strong familiarity with data APIs and pre-processing of raw unstructured data
- Some familiarity with popular social media platforms, internet meme culture, and cryptocurrency
- Ability to work well with others and communicate problems and findings clearly
- Familiarity with machine learning and deep learning frameworks such as PyTorch, TensorFlow, fastText, Hugging Face, scikit-learn, gensim, or others is a plus
- Experience with data DevOps tools such as Airflow, Amundsen, Kafka, or others is a plus
In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.
CertiK is proud to be an equal opportunity employer. We will not discriminate against any applicant or employee on the basis of age, race, color, creed, religion, sex, sexual orientation, gender, gender identity or expression, medical condition, national origin, ancestry, citizenship, marital status or civil partnership/union status, physical or mental disability, pregnancy, childbirth, genetic information, military and veteran status, or any other basis prohibited by applicable federal, state or local law.
CertiK will consider for employment qualified applicants with criminal histories in a manner consistent with local and federal requirements.
https://www.eeoc.gov/sites/default/files/migrated_files/employers/poster_screen_reader_optimized.pdf
All CertiK employees are expected to actively support diversity on their teams and in the Company.
Tags: Airflow APIs Big Data Blockchain Computer Science Data analysis Data pipelines Deep Learning DevOps HuggingFace Kafka Machine Learning NLP NumPy Pandas Pipelines Python PyTorch Security Snowflake Spark Statistics TensorFlow Unstructured data
Perks/benefits: Career development Flex vacation Health care Insurance