Palantir Data Engineer | 9 to 15 years | Pan India

Bengaluru, KA, IN

Capgemini

A global leader in consulting, technology services and digital transformation, we offer an array of integrated services combining technology with deep sector expertise.

Job Description

  • Data ingestion into Foundry from external data sources and legacy systems using Agents/Magritte connectors and Data Connection; working with raw files.
  • Excellent proficiency in data-processing languages including, but not limited to, Python, PySpark, and SQL.
  • Design, build, and maintain optimal data pipeline architecture in Foundry.
  • Create and optimize data pipelines using PySpark on the back end and TypeScript on the front end; publish and consume shared libraries in Code Repositories (a PySpark sketch follows this list).
  • Assemble large, complex data sets in Foundry that meet functional and non-functional business requirements.
  • Identify, design, and implement internal process improvements in Foundry: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Schedule pipeline jobs in Foundry; monitor data pipeline health and configure health checks and alerts (data expectations).
  • Build analytics tools using Contour, Quiver, Workshop, and Slate that draw on the data pipeline to provide actionable insight into KPIs such as customer acquisition, operational efficiency, and other key business performance metrics.
  • Good understanding of and working knowledge of Foundry tools: Ontology, Contour, Object Explorer, Ontology Manager, object editing via Actions/TypeScript, Code Workbook, Code Repositories, and Foundry ML.
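
For illustration, a minimal sketch of the kind of PySpark pipeline transform described above, written against Foundry's Python transforms API (transforms.api); the dataset paths, column names, and cleaning logic here are hypothetical:

    # Minimal Foundry transform sketch; paths and columns are hypothetical.
    from pyspark.sql import functions as F
    from transforms.api import transform_df, Input, Output

    @transform_df(
        Output("/Company/pipelines/clean/customers"),   # hypothetical output dataset
        raw=Input("/Company/pipelines/raw/customers"),  # hypothetical ingested dataset
    )
    def clean_customers(raw):
        # Deduplicate on the key column, normalize email casing,
        # and drop rows missing the key.
        return (
            raw.dropDuplicates(["customer_id"])
            .withColumn("email", F.lower(F.col("email")))
            .filter(F.col("customer_id").isNotNull())
        )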

Primary Skills

  • Candidates must have 5+ years of experience in a data engineer role and should have experience with the following software/tools:
  • Hadoop, Spark, Kafka, etc.
  • Experience with relational SQL and NoSQL databases, including Postgres and Cassandra/MongoDB.
  • Experience with stream-processing systems: Storm, Spark Streaming, etc. (a streaming sketch follows this list).
  • Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
  • Advanced working SQL knowledge and the ability to quickly envision a technical solution from functional requirements: at least 4 years with SQL.
  • Experience building and optimizing 'big data' pipelines, architectures, and data sets: at least 5 years with PySpark/Python.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
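
As an illustration of the stream-processing experience above, a minimal PySpark Structured Streaming sketch that reads from Kafka and maintains per-minute event counts; the broker address and topic name are placeholders, and the job assumes the spark-sql-kafka connector package is on the classpath:

    # Minimal Structured Streaming sketch; broker and topic are placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

    # Read a stream of events from a Kafka topic.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
        .option("subscribe", "events")                     # placeholder topic
        .load()
    )

    # Kafka delivers value as binary; cast it to a string, then count by minute.
    counts = (
        events.selectExpr("CAST(value AS STRING) AS value", "timestamp")
        .groupBy(F.window("timestamp", "1 minute"))
        .count()
    )

    # Write the running counts to the console, a sink suited to a local sketch.
    query = counts.writeStream.outputMode("complete").format("console").start()
    query.awaitTermination()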

Secondary Skills

  • Strong analytical skills for working with large datasets.
  • Build processes supporting data transformation, data structures, metadata, dependency management, and workload management.
  • Experience supporting and working with cross-functional teams in a dynamic environment.