Data Science Co-op
United States - Remote
Authenticate
Authenticate.com is a verification platform that provides Knowledge Based Authentication, Photo ID & Passport Verification, Age, Employment, Education & Criminal Background Checks, MVR & DMV Records, Email, SMS, FEIN & SSN Verification.
Location: Remote
Job Type: Co-op or Internship
Duration: 3-6 months
About Us:
Authenticate.com is a leading provider of identity verification and background check solutions. Our innovative platform helps businesses prevent fraud, ensure compliance, and build trust with their users. We offer a wide range of verification services, including document verification, facial recognition, database checks, and continuous monitoring.
Job Summary:
We are seeking a highly motivated and detail-oriented Data Science Co-op to join our team. In this role, you will play a critical part in developing and maintaining our data infrastructure, with a focus on building vector databases and using Large Language Models (LLMs) to normalize criminal history and employment history data. This is an excellent opportunity to apply your data science skills to real-world problems and contribute to innovative solutions in the identity verification and background screening space.
Responsibilities:
· Design, develop, and maintain vector databases to store and query large datasets related to criminal history and employment history
· Utilize Large Language Models (LLMs) to normalize and standardize data from various sources, ensuring consistency and accuracy
· Collaborate with cross-functional teams to integrate vector databases and LLM-based data normalization into our background screening and identity verification products
· Develop and implement data quality control processes to ensure data accuracy, completeness, and integrity
· Analyze and visualize data to identify trends, patterns, and insights that can inform product development and improvement
· Stay up to date with industry trends and advancements in natural language processing, machine learning, and data science
· Communicate technical results and insights to non-technical stakeholders through clear and concise reporting
Requirements:
· Currently enrolled in a Bachelor's or Master's degree program in Computer Science, Data Science, Mathematics, Statistics, or a related field
· Strong programming skills in Python, with experience in data science libraries such as NumPy, Pandas, and scikit-learn
· Familiarity with vector databases and with Large Language Models (LLMs) or transformer-based language models such as BERT, RoBERTa, or DistilBERT
· Experience with data preprocessing, normalization, and feature engineering
· Knowledge of data visualization tools such as Matplotlib, Seaborn, or Plotly
· Excellent problem-solving skills, with the ability to work independently and collaboratively as part of a team
· Strong communication and interpersonal skills, with the ability to explain technical concepts to non-technical stakeholders
Nice to Have:
· Experience with cloud-based data storage solutions such as AWS S3 or Google Cloud Storage
· Familiarity with containerization using Docker and orchestration using Kubernetes
· Knowledge of data governance and compliance regulations such as GDPR and CCPA
· Experience with agile development methodologies and version control systems such as Git
What We Offer:
· Competitive salary for full-time co-ops, or complete flexibility and self-determination for unpaid interns
· Opportunity to work on cutting-edge projects in the identity verification and background screening space
· Collaborative and dynamic work environment with a team of experienced data scientists and engineers
· Professional development opportunities, including training and mentorship
· Flexible work arrangements, including remote work options
If you are a motivated and detail-oriented individual with a passion for data science and machine learning, please submit your resume, cover letter, and any relevant projects or code samples. We look forward to hearing from you!