Data Engineer - Gen AI - Senior Associate

Bengaluru (SDC) - Bagmane Tech Park

PwC

We are a community of solvers combining human ingenuity, experience and technology innovation to help organisations build trust and deliver sustained outcomes.

Line of Service

Advisory

Industry/Sector

Not Applicable

Specialism

Data, Analytics & AI

Management Level

Senior Associate

Job Description & Summary

A career within Data and Analytics services will provide you with the opportunity to help organisations uncover enterprise insights and drive business results using smarter data analytics. We focus on a collection of organisational technology capabilities, including business intelligence, data management, and data assurance that help our clients drive innovation, growth, and change within their organisations in order to keep up with the changing nature of customers and technology. We make impactful decisions by mixing mind and machine to leverage data, understand and navigate risk, and help our clients gain a competitive edge.

As part of our Analytics and Insights Consumption team, you'll analyze data to derive useful insights that help clients address core business issues or drive strategic outcomes. You'll use visualization, statistical and analytics models, AI/ML techniques, ModelOps and other techniques to develop these insights.

To really stand out and make us fit for the future in a constantly changing world, each and every one of us at PwC needs to be a purpose-led and values-driven leader at every level. To help us achieve this, we have the PwC Professional, our global leadership development framework. It gives us a single set of expectations across our lines, geographies and career paths, and provides transparency on the skills we need as individuals to be successful and progress in our careers, now and in the future.

As a Senior Associate, you'll work as part of a team of problem solvers, helping to solve complex business issues from strategy to execution. PwC Professional skills and responsibilities for this management level include but are not limited to:

  • Use feedback and reflection to develop self-awareness and personal strengths, and to address development areas.
  • Delegate to others to provide stretch opportunities, coaching them to deliver results.
  • Demonstrate critical thinking and the ability to bring order to unstructured problems.
  • Use a broad range of tools and techniques to extract insights from current industry or sector trends.
  • Review your work and that of others for quality, accuracy and relevance.
  • Know how and when to use the tools available for a given situation, and be able to explain the reasons for this choice.
  • Seek and embrace opportunities which give exposure to different situations, environments and perspectives.
  • Use straightforward communication, in a structured way, when influencing and connecting with others.
  • Read situations and modify behavior to build quality relationships.
  • Uphold the firm's code of ethics and business conduct.

Responsibilities:

- Design, develop, and maintain data pipelines and ETL processes for GenAI projects.

- Collaborate with data scientists and software engineers to implement machine learning models and algorithms.

- Optimize data infrastructure and storage solutions to ensure efficient and scalable data processing.

- Implement event-driven architectures to enable real-time data processing and analysis.

- Utilize containerization and orchestration technologies like Docker and Kubernetes for efficient deployment and scalability.

- Develop and maintain data lakes for storing and managing large volumes of structured and unstructured data.

- Implement and integrate LLM frameworks (LangChain, Semantic Kernel) for advanced language processing and analysis.

- Collaborate with cross-functional teams to design and implement solution architectures for GenAI projects.

- Utilize cloud computing platforms such as Azure or AWS for data processing, storage, and deployment.

- Monitor and troubleshoot data pipelines and systems to ensure smooth and uninterrupted data flow.

- Stay up-to-date with the latest advancements in GenAI technologies and recommend innovative solutions to enhance data engineering processes.

- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.

- Document data engineering processes, methodologies, and best practices.

- Maintain solution architecture certifications and stay current with industry best practices.

Requirements:

  • Python Proficiency: Minimum 3 years of hands-on experience building applications with Python.

  • Scalable System Design: Solid understanding of designing and architecting scalable Python applications, particularly for Gen AI use cases, including the components and system architecture patterns needed to build cohesive, decoupled applications.

  • Web Frameworks: Familiarity with Python web frameworks (Flask, FastAPI) for building web applications around AI models.

  • Modular Design & Security: Demonstrated ability to design applications with modularity, reusability, and security best practices in mind (session management, vulnerability prevention, etc.).

  • Cloud-Native Development: Familiarity with cloud-native development patterns and tools (e.g., REST APIs, microservices, serverless functions).

  • Cloud Deployments: Experience deploying and managing containerized applications on Azure/AWS (Azure Kubernetes Service, Azure Container Instances, or similar).

  • Version Control (Git): Strong proficiency in Git for effective code collaboration and management.

  • CI/CD: Knowledge of continuous integration and deployment (CI/CD) practices on cloud platforms.

  • 3-5 years of relevant technical/technology experience, with a focus on GenAI projects.

  • Strong programming skills in Python.

  • Experience with data processing frameworks like Apache Spark or similar.

  • Proficiency in SQL and database management systems.

Preferred Skills:

  • Gen AI Frameworks: Experience with LLM frameworks or tools for interacting with LLMs, such as LangChain, Semantic Kernel, or LlamaIndex.

  • Data Pipelines: Experience in setting up data pipelines for model training and real-time inference.

Education (if blank, degree and/or field of study not specified)

Degrees/Field of Study required:

Degrees/Field of Study preferred:

Certifications (if blank, certifications not specified)

Required Skills

Optional Skills

Desired Languages (If blank, desired languages not specified)

Travel Requirements

Not Specified

Available for Work Visa Sponsorship?

No

Government Clearance Required?

No

Job Posting End Date

Tags: APIs Architecture AWS Azure Business Intelligence CI/CD Data Analytics Data management Data pipelines Docker Engineering ETL FastAPI Flask Generative AI Git Kubernetes LangChain LLMs Machine Learning Microservices ML models Model training Pipelines Python Security Spark SQL Statistics Unstructured data

Perks/benefits: Career development Transparency

Region: Asia/Pacific
Country: India
