AI Data Specialist
Cambridge, MA
Kensho
Kensho develops cutting-edge products and technologies that transform businesses. We are the AI Innovation Hub for S&P Global.
Kensho’s AI Data Team is the data labeling component of the Machine Learning team: we build datasets for model training, evaluate model performance on real-world data, and manage an offshore team of data labelers. As a data specialist, you will collaborate with ML engineers to plan data annotation projects; extract, transform, and load data for those projects; work with our offshore annotation team to create high-quality datasets; and analyze finished datasets. The ideal candidate is comfortable working with structured and unstructured data (preferably using Python), is able to collaborate with colleagues across Kensho and S&P, can solve problems both independently and as part of a team, and is eager to learn new skills.
At Kensho, we believe in flexibility-first, and give our employees the opportunity to work from where they feel most productive and engaged (must be in the United States). We also value in-person collaboration, so there may be times when travel to one of our Kensho hubs (e.g., Cambridge, MA or NYC) will be required for team meetings or company events.
Kensho states that the anticipated base salary range for the position is 90k-125k. In addition, this role is eligible for an annual incentive bonus and potential equity plans. At Kensho, it is not typical for an individual to be hired at or near the top of the range for their role, and compensation decisions depend on the facts and circumstances of each case.
What You’ll Do:
- Collaborate with other ML teams at Kensho to plan data labeling projects: developing ontologies, drafting instructions, and establishing project timelines and milestones
- Extract, transform, and load data from a variety of sources into data labeling tools
- Monitor and assist our offshore data labelers as they label data, and review their annotations
- Create final datasets for ML teams once labeling is completed, and analyze the finalized data for trends and edge cases
- Assist with training and mentoring our offshore data labelers
- Assist with ad-hoc, high-visibility data projects as needed
- Audit current labeling pipelines and practices, and collaborate with ML Ops and ML Eng on model monitoring pipelines
- As a secondary focus, train our S&P auxiliary labeling team and assign projects to it. This may include onboarding S&P analysts onto our internal tool, distributing instruction sets, identifying and addressing edge cases, refining annotation guidelines, and pinpointing areas or cases where ML models struggle
What You'll Need:
- 2+ years of industry experience designing, building, evaluating, and maintaining robust datasets
- 0-1 years of management or mentorship experience
- Experience mentoring and/or building a team. We are looking for someone who can think long term about hiring and training
- Experience partnering and collaborating with product and technical teams
- An innovation-oriented mindset and the ability to come up with out-of-the-box solutions
- A thoughtful and collaborative approach as a teammate
- Refined organizational skills and a results-oriented attitude with superior attention to detail
- A habit of measuring your professional success by your team’s success
Technology You'll Encounter:
These languages and tools are ones that our team uses, or will eventually be using, on a regular basis. They are not prerequisites for successfully applying for this role, but demonstrated knowledge of some or all of them, or a proven ability to quickly learn new tools and languages, will stand out.
- Data transformation and analysis: Python, Pandas, NumPy, Matplotlib
- Deployment: Airflow
- Datastores: S3, SQLite
- Annotation tools: Labelbox
Perks/benefits: Career development Conferences Equity Health care Medical leave Parental leave Pet friendly Salary bonus Startup environment Team events Unlimited paid time off