Senior Data Scientist - GenAI
Hyderabad RSS
Roche
As a pioneer in healthcare, we have been committed to improving lives since the company was founded in 1896 in Basel, Switzerland. Today, Roche creates innovative medicines and diagnostic tests that help millions of patients globally. Roche fosters diversity, equity and inclusion, representing the communities we serve. When dealing with healthcare on a global scale, diversity is an essential ingredient to success. We believe that inclusion is key to understanding people’s varied healthcare needs. Together, we embrace individuality and share a passion for exceptional care. Join Roche, where every voice matters.
The Position
Senior NLP Data Scientist - GenAI (d/f/m)
Roche India – Roche Services & Solutions
Hyderabad
A healthier future. It’s what drives us to innovate. To continuously advance science and ensure everyone has access to the healthcare they need today and for generations to come. Creating a world where we all have more time with the people we love.
That’s what makes us Roche.
Roche has established the Global Analytics and Technology Center of Excellence (GATE), part of Roche Services & Solutions India, to drive analytics- and technology-driven solutions by partnering with Roche affiliates across the globe. To learn more about us, visit https://www.rocheindia.com/en/About_Roche/GATE.html
As the Senior NLP Data Scientist specializing in Generative AI, you will lead the development of business solutions through advanced NLP techniques in an Agile delivery mode, working directly with stakeholders in the business teams of Roche affiliates across the globe.
Your Opportunity
Work independently and/or lead a team of Data Scientists to design and develop pipelines, and to develop and implement enterprise-level GenAI models and tools that solve business problems
Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions
Conduct research to identify emerging trends and technologies in Gen AI and other areas of data science
Build toolsets and re-usable components for our future projects and ideas
Automate the AI solution to achieve maximum efficiency
Write modular, production-ready code that scales easily across data volumes and business functions
Manage project delivery with your Manager through regular meetings, extensive documentation and clear timeline setting
Develop processes and tools to monitor and analyze model performance and data accuracy
Train junior team members on NLP and ML techniques
Who you are
You hold a bachelor’s degree (B Tech/BE), preferably with a specialization in Computer Science; 5-8 years of experience in the data science / ML space is required. Preferably, you also have a Master’s degree in Data Science or Machine Learning. Certifications in AI/ML/Data Science would be an added advantage
Knowledge of US/Europe pharmaceutical market and experience with pharmaceutical data would be a plus, but not a must
Knowledge of and experience in Generative AI. Familiarity with Docker, Kubernetes and cloud platforms (especially AWS) is an added advantage
Knowledge of embedding representations and deep-learning-based solutions for NLP (e.g. word2vec, Gensim)
Understanding of basic concepts in Text Mining, NLP and NLU, and hands-on experience with regular expressions
Prompt Engineering: Craft effective prompts and instructions that guide models towards desired outputs, ensuring coherence, factual accuracy, and alignment with ethical considerations
Model Optimization: Leverage techniques such as RAG and fine-tuning to optimize models
Experience with modern deep learning architectures for NLP (encoder-decoder, transformers, attention), including:
hands-on experience with transformer models and libraries such as Hugging Face, SBERT, GPT-2/GPT-3
ability to build ML/DL pipelines for training and tuning models (including transfer learning)
At least 3 years of experience with the NLP tools we typically use in daily work:
scikit-learn, NumPy, pandas
PyTorch or TensorFlow
spaCy, NLTK
3+ years of general experience in NLP/AI software engineering, especially:
proficiency with Python is a must; PySpark is good to have
experience with Git and GitLab
knowledge of Jira and CI/CD tools is nice to have
understanding of software testing (unit tests, integration tests, smoke tests)
good software engineering practices, design patterns
bash/shell scripting is nice to have
experience with Docker and API development is a plus
experience with cloud platforms (AWS preferred)
Excellent written and verbal communication skills for coordinating across teams and ability to deliver fully working products (deployment activities)
Strong analytical mindset and logical thinking, a strong QC mindset, and a general understanding of the Agile project delivery process
Demonstrated capabilities in consulting, creativity, critical thinking, project planning, and attention to detail
Who we are
At Roche, more than 100,000 people across 100 countries are pushing back the frontiers of healthcare. Working together, we’ve become one of the world’s leading research-focused healthcare groups. Our success is built on innovation, curiosity and diversity.
Roche is an Equal Opportunity Employer.