Palantir Data Engineer | 9 to 15 years | Pan India
Bengaluru, KA, IN
Capgemini
A global leader in consulting, technology services and digital transformation, we offer an array of integrated services combining technology with deep sector expertise.
Job Description
- Data ingestion into Foundry from external data sources and legacy systems using Agents, Magritte connectors, and Data Connection; working with raw files.
- Excellent proficiency in data processing scripting languages such as, but not limited to, Python, PySpark, and SQL.
- Design, create, and maintain an optimal data pipeline architecture in Foundry.
- Ability to create and optimize data pipelines using PySpark for the back end and TypeScript for the front end; publishing and using shared libraries in Code Repository.
- Assemble large, complex data sets that meet functional and non-functional business requirements in Foundry.
- Identify, design, and implement internal process improvements in Foundry: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Schedule pipeline jobs in Palantir; monitor data pipeline health and configure health checks and alerts (data expectations).
- Build analytics tools using Contour, Quiver, Workshop applications, and Slate that use the data pipeline to provide actionable insights into KPIs such as customer acquisition, operational efficiency, and other key business performance metrics.
- Good understanding and working knowledge of Foundry tools: Ontology, Contour, Object Explorer, Ontology Manager, Object editor using Actions/TypeScript, Code Workbook, Code Repository, and Foundry ML.
Primary Skills
- Candidates must have 5+ years of experience in a Data Engineer role and should have experience using the following software/tools:
- Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra/MongoDB
- Experience with stream-processing systems: Storm, Spark Streaming, etc.
- Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
- Advanced working knowledge of SQL and the ability to quickly envision a technical solution based on functional requirements: at least 4 years in SQL
- Experience building and optimizing ‘big data’ data pipelines, architectures, and data sets: at least 5 years in PySpark/Python
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
Secondary Skills
- Strong analytic skills related to working with big datasets.
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- Experience supporting and working with cross-functional teams in a dynamic environment.
Tags: Architecture Big Data Cassandra Data pipelines Hadoop Java Kafka KPIs Machine Learning NoSQL Pipelines PostgreSQL PySpark Python Scala Spark SQL Streaming TypeScript