Data Science Co-op

United States - Remote

Authenticate

Authenticate.com is a verification platform that provides Knowledge-Based Authentication, Photo ID & Passport Verification, Age, Employment, Education & Criminal Background Checks, MVR & DMV Records, and Email, SMS, FEIN & SSN Verification.

Location: Remote

Job Type: Co-op or Internship

Duration: 3-6 months

About Us:

Authenticate.com is a leading provider of identity verification and background check solutions. Our innovative platform helps businesses prevent fraud, ensure compliance, and build trust with their users. We offer a wide range of verification services, including document verification, facial recognition, database checks, and continuous monitoring.

Job Summary:

We are seeking a highly motivated and detail-oriented Data Science Co-op to join our team. In this role, you will help develop and maintain our data infrastructure, with a focus on building vector databases and using Large Language Models (LLMs) to normalize criminal history and employment history data. This is an excellent opportunity to apply your data science skills to real-world problems and contribute to innovative solutions in the identity verification and background screening space.

Responsibilities:

·       Design, develop, and maintain vector databases to store and query large datasets related to criminal history and employment history

·       Utilize Large Language Models (LLMs) to normalize and standardize data from various sources, ensuring consistency and accuracy

·       Collaborate with cross-functional teams to integrate vector databases and LLM-based data normalization into our background screening and identity verification products

·       Develop and implement data quality control processes to ensure data accuracy, completeness, and integrity

·       Analyze and visualize data to identify trends, patterns, and insights that can inform product development and improvement

·       Stay up to date with industry trends and advancements in natural language processing, machine learning, and data science

·       Communicate technical results and insights to non-technical stakeholders through clear and concise reporting

Requirements:

·       Currently enrolled in a Bachelor's or Master's degree program in Computer Science, Data Science, Mathematics, Statistics, or a related field

·       Strong programming skills in Python, with experience in data science libraries such as NumPy, Pandas, and scikit-learn

·       Familiarity with vector databases and with language models, from transformer encoders such as BERT, RoBERTa, or DistilBERT to Large Language Models (LLMs)

·       Experience with data preprocessing, normalization, and feature engineering

·       Knowledge of data visualization tools such as Matplotlib, Seaborn, or Plotly

·       Excellent problem-solving skills, with the ability to work independently and collaboratively as part of a team

·       Strong communication and interpersonal skills, with the ability to explain technical concepts to non-technical stakeholders

Nice to Have:

·       Experience with cloud-based data storage solutions such as AWS S3 or Google Cloud Storage

·       Familiarity with containerization using Docker and orchestration using Kubernetes

·       Knowledge of data governance and compliance regulations such as GDPR and CCPA

·       Experience with agile development methodologies and version control systems such as Git

What We Offer:

·       Competitive co-op salary for full-time co-ops, or complete flexibility and self-determination for unpaid interns

·       Opportunity to work on cutting-edge projects in the identity verification and background screening space

·       Collaborative and dynamic work environment with a team of experienced data scientists and engineers

·       Professional development opportunities, including training and mentorship

·       Flexible work arrangements, including remote work options

If you are a motivated and detail-oriented individual with a passion for data science and machine learning, please submit your resume, cover letter, and any relevant projects or code samples. We look forward to hearing from you!

Tags: Agile AWS BERT Computer Science Data governance Data quality Data visualization Docker Engineering Feature engineering GCP Git Google Cloud Kubernetes LLMs Machine Learning Mathematics Matplotlib NLP NumPy Pandas Plotly Python Scikit-learn Seaborn Statistics

Perks/benefits: Career development Competitive pay

Regions: Remote/Anywhere North America
Country: United States
