Machine Learning Engineer
Krakow, Lesser Poland, Poland
As a Data Engineer in Ocado Smart Platform, you will be responsible for expanding and optimising our cloud-based Data Lake architecture. In this position, your responsibilities will include designing and building data marts to serve specific business lines, creating best practices and defining data processing strategies for our Platform. This gives you the unique opportunity to work with, and influence the evolution of, state-of-the-art technologies provided by Google Cloud Platform (BigQuery, Dataflow, Cloud Storage), the same technologies that Google uses internally to process extreme amounts of data.
You will be responsible not only for the development, maintenance and support of scalable and well-monitored data processing workflows but also for creating toolkits that help all Ocado teams build data-driven intelligence into our products.
You will work with an exceptional team of engineers and a product owner to identify, scope and plan projects, and to communicate advanced technical issues and findings effectively to a range of technical and non-technical internal audiences. You will be an advocate of the Data Driven Company paradigm, promoting data processing best practices and sharing your knowledge with others. You will have the chance to cooperate with various development teams to deliver specific features and remove blockers to launching new clients.
We are looking for experts who get things done by using their smarts and whatever tools make sense for the job. We want people who love to stand on the shoulders of giants to solve new problems and who thrive in a rapidly innovating space.
Desired skills & competencies
- Ability to write and optimise SQL queries
- Knowledge of Java o
- Knowledge of databases and best engineering practices
- Data structuring and design skills
- Strong verbal and written communication in English and Polish
- Previous experience working within the MapReduce programming model (e.g., Hadoop), Spark or Apache Beam
- Knowledge of the massively parallel processing (MPP) approach
- Previous experience working with data sets measured in terabytes
- Knowledge of data warehousing principles
- Knowledge of Google Cloud Platform (Compute Engine, BigQuery, Dataflow, Dataproc, Cloud Storage)
- Good understanding of data protection issues
- Passionate about Big Data
- Able to explore new technology, with strong innovation skills
- Good communication skills
- Able to assess strengths and weaknesses of technology or approaches and communicate their best application
- Able to operate independently towards goals
- Able to coordinate well with other teams