Data Engineer for Marcel Product
Makati, Philippines
Applications have closed
Publicis Groupe
Company Description
Publicis Re:Sources is the technology and business services backbone of Publicis Groupe.
Publicis Groupe is the third-largest communications group worldwide and a leader in digital and interactive communication. With activities spanning 104 countries on five continents, Publicis Groupe employs approximately 80,000 professionals and offers local and international clients a complete range of communication services.
About Marcel Project:
Marcel is the AI platform that connects more than 80,000 employees at Publicis Groupe across geographies, agencies, and capabilities. Marcel helps our employees learn, share, create, connect, and grow more than ever before. Marcel connects employees to our culture, helps them master new skills, inspires them, and tackles diversity and inclusion head-on to help build a better world together. It’s a place where we come together every day to amplify each other as one global team.
All of this employee engagement creates over 100 million data points that power our AI-enabled knowledge graph, making the experience even more relevant for employees. And for our clients, our knowledge graph makes Marcel one of the most powerful tools ever invented for finding exactly the right expertise, teams and knowledge that we need to win in the Platform World.
Marcel is a strategic investment in our people, aimed at being their personal growth engine in this hybrid world. This role joins the dynamic Marcel team to help build and evolve the product.
Job Description
The key accountabilities for this role include, but are not limited to:
· Lead and mentor a team of Data Engineers across complex data pipelines for a variety of consumers.
· Create and maintain optimal data pipeline architecture.
· Assemble large, complex data sets that meet functional and non-functional business requirements.
· Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
· Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and Azure ‘big data’ technologies.
· Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
· Work with stakeholders, including the Executive, Product, Data, and Design teams, to assist with data-related technical issues and support their data infrastructure needs.
· Create data tools for analytics and data science team members that assist them in building and optimizing our product into an innovative industry leader.
· Work with data and analytics experts to strive for greater functionality in our data systems.
· Ensure the deliverables for each Sprint are clearly understood by the Agile team(s).
· Ensure that the Agile team(s) deliver working software of sufficient quality to release to clients at the end of each development sprint.
· Ensure source control repositories are appropriately managed.
· Ensure the Agile team(s) receive sufficient resourcing to complete their objectives.
Specific responsibilities:
· Write maintainable and effective data feeds and pipelines.
· Follow best practices for test-driven development and continuous integration.
· Design, develop, test, and implement end-to-end requirements.
· Contribute to all phases of the development life cycle.
· Perform unit testing and troubleshoot applications.
Qualifications
Minimum relevant experience: 5 years
Maximum relevant experience: 9 years
Must have skills:
· Strong experience in Azure, ADF, PySpark, Scala, Databricks, and SQL.
· Experience with ETL, JSON, and Hop or other ETL orchestration tools.
· Experience working with Event Hubs and streaming data.
· Understanding of ML models and experience building ML pipelines with MLflow and Airflow.
· Experience with big data tools: Hadoop, Spark, Kafka, etc.
· Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
· Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
· Experience with stream-processing systems: Storm, Spark Streaming, etc.
· Understanding of graph data; Neo4j is a plus.
· Strong knowledge of Azure-based services.
· Strong understanding of RDBMS data structures, Azure Tables, Blob storage, and other data sources.
· Experience with test-driven development.
· Experience with Power BI or similar tools.
· Understanding of Jenkins and CI/CD processes using ADF and Databricks preferred.
Good to have skills:
- Bachelor's degree in engineering, computer science, information systems, or a related field from an accredited college or university; Master's degree from an accredited college or university is preferred
- Advanced SQL knowledge and experience working with relational databases, including query authoring and working familiarity with a variety of databases.
- Experience building and optimizing ADF- and PySpark-based data pipelines, architectures, and data sets on graph databases and Azure Data Lake.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytical skills for working with unstructured datasets.
- Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable Azure based data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Understanding of Node.js is a plus, but not required
Additional Information
Attributes/behaviours
· Ability to design, develop, and implement complex requirements.
· Build reusable components and front-end libraries for future use.
· Translate designs and wireframes into high-quality code.
· Pro-active support to the business is a key attribute for this role, with a customer-service focus that links system requirements with business outcomes.
Perks/benefits: Career development, startup environment, team events