Intern, Data Engineer

London City, London, GB

Internship Entry-level / Junior

Copyright Clearance Center

Collective licensing pioneer CCC helps you integrate, access, and share information through licensing, content, software and professional services.

Posted 2 weeks ago

Job Overview:

We are looking for a Data Engineer Intern to work with the Architecture team on internal initiatives. These initiatives encompass exploratory work, analytics, and nascent services and products. The intern will be allocated to one of these initiatives.

Our analytics stack includes Spark / PySpark for bulk processing, Zeppelin notebooks and Airflow for process orchestration and data profiling, graph and relational databases for storage, R for visualization, and a variety of techniques for statistical analysis and machine learning.

The individual must possess oral and written English communication skills and will gain experience working with a cross-functional engineering team.

Experience with AWS is a plus.

Primary Responsibilities:

Work with product owners and technical staff to integrate, profile and analyze internal and external data sets to provide insight into the viability and quality of potential and existing CCC data offerings.
Participate as a team member in the analysis, development, implementation, testing and documentation of data engineering projects, setting and meeting realistic timelines and deadlines.
Ensure that design and code reviews occur in a timely manner and that systems are documented.

Requirements:

Python and/or R programming
Experience with databases, querying, reporting and ETL
Practiced in working with multiple data sets, creating combined views, measuring data quality, and applying insights to business problems
Experience working with APIs to query and obtain data
An understanding of fuzzy matching, entity matching/deduplication would be beneficial
The ability to track and evaluate experiments, communicate findings and propose next steps based on the outcomes
Familiar with GitHub for version control, Jira for task/issue tracking, and structured approaches to working on data-centric tasks (such as CRISP-DM)
Ability to work both independently and collaboratively, subject to peer review
Capable of setting and meeting deadlines
Excellent analytical, interpretative and interpersonal skills, backed up by the ability to convey meaningful information through verbal and written communication
May be accountable for other results and activities as assigned.

Tags: Airflow APIs Architecture AWS Data quality Engineering ETL GitHub Jira Machine Learning PySpark Python R RDBMS Spark Statistics Testing

Region: Europe

Country: United Kingdom

Category: Engineering Jobs
