Senior Data Engineer
We are a fast-growing and pioneering people analytics company that is transforming the financial workplace. We use cutting-edge software and machine learning to generate previously unidentifiable insights into employee behavior and performance. We have been recognized by renowned companies such as Amazon Web Services and Google Cloud for our achievements in AI, big data analytics, and machine learning. We have also been included in the Forbes FinTech 50, CB Insights AI 100, and Tech Nation’s prestigious Future 50 program.
Our goal is to help businesses achieve better outcomes by developing and delivering data-driven solutions for compliance, CRM, HR, and workplace productivity. We also aim to rapidly expand our worldwide customer base to include companies across all major industries.
About the role
The Data Operations team is responsible for the management of datasets that are used by our analysts and engineers to train and test our analytics and machine learning applications. These datasets come from diverse sources and have many different use cases. Typically they are made up of communications data such as emails, chats and phone calls, but they could also take the form of annotated lists, spreadsheets, or other data structures. Datasets can be small and specialised or they can be very large. They contain industry-specific concepts and terminology, and cover a growing list of natural languages. They may have undergone processes such as annotation of artifacts, classification, or anonymisation / pseudonymisation.
Despite this variation and complexity, it is vital that our datasets are named, formatted, stored, categorised and catalogued consistently, so that it is clear what they contain and so that their valuable contents can be located and used when needed. This will enable rapid, coordinated and effective development of applications across the Data Science functions.
The Senior Data Engineer is responsible for the technical solutions and processes that facilitate and automate the dataset lifecycle. Key responsibilities include the creation of tooling, automation and processes related to:
- Standardised data formats and naming conventions
- Updating datasets and integrating data from different sources
- Classification and labelling of datasets and individual content items
- Storage and version management
In this role you will have the opportunity to define new technical solutions, and you will see how your work contributes directly to a team that is helping to make the corporate and financial world a safer and fairer place. If you enjoy working on challenging tasks and are ready to apply your technical and data management skills to cutting-edge technologies, come and join our team!
Ideal candidate profile
- Experience in Python for data analysis and visualization (Jupyter notebooks / pandas / Matplotlib / seaborn)
- Experience working with SQL databases
- Experience with management and manipulation of very large datasets
- Good understanding of AWS infrastructure, including the S3 API and CLI
- Strong knowledge of Git
- Good understanding of CI/CD automation tooling (for example, Jenkins or GitLab/Bitbucket pipelines)
What we offer
- A highly accomplished and global team
- Competitive salary
- Fully covered health benefits
- Training and mentoring opportunities
Selected candidates will be invited for a phone interview to discuss their skills, experience, and interests. Promising candidates will be invited to meet the Behavox team and deliver a presentation. Those who demonstrate the highest potential will be invited for a final executive-level interview.