Research Engineer - Data Quality
New York City
Character.AI
Meet AIs that feel alive. Chat with anyone, anywhere, anytime. Experience the power of super-intelligent chat bots that hear you, understand you, and remember you.
Character’s mission is to empower everyone with AGI. Our vision is to enable people with our technology so that they can use Character.AI any moment of any day.
Character.AI is one of the world’s leading personal AI platforms. Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character.AI is a full-stack AI company with a globally scaled direct-to-consumer platform. As of 2023 that platform was #2 in the space in user engagement. Character.AI is uniquely centered around people, letting users personalize their experience by interacting with AI “Characters.” The company achieved unicorn status in 2023 and was named Google Play’s AI App of the Year.
Noam co-invented the key tech powering LLMs and was recently named to TIME100’s Most Influential People in AI list. TIME called him “one of the most important and impactful people of the space’s past, present, and future.” Daniel created and led LaMDA, the breakthrough conversational tech project currently powering Bard.
To learn more, please visit beta.character.ai.
About the role
As a passionate data miner, you wield big data tools and visualization software to line up research developments. You love uncovering interesting subsets of data, creating clear dashboards to communicate your findings, and proposing ideas for unexplored opportunities. You are also extremely interested in learning how to support large language model development with unparalleled data quality and performance insights.
You are an ML engineer, data engineer, or data scientist who wants to work with world-class LLM researchers to curate, develop, and analyze our data catalog. Your responsibilities are threefold:
Curate, mine, and analyze datasets for LLMs
Work with our Product org to identify datasets needed for specific user experiences
Help maintain and improve core tables in our data lake used for research across the company
Data is the lifeblood of AI. Alongside the data platform team, you will be responsible for making sure this vital resource is available, understood, and of the highest quality.
Who we’re looking for
Required Experience:
5+ years of experience
Familiarity with Machine Learning and NLP and willingness to learn more on the job
Experience mining text and graphical data
Data visualization skills
SQL Wizardry
Spark Experience
Passionate about Conversational AI or large language models
Additional Desired Experience:
Experience with cloud platforms like GCP
Experience with Kubernetes
Experience training your own LLMs
You will be a good fit if you are proactive and have a “get things done” mindset. Given our current pace of growth and load on our systems, most people have had a significant impact during their first week at the company.
Character is an equal opportunity employer and does not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status. We value diversity and encourage applicants from a range of backgrounds to apply.
Perks/benefits: Career development