Data Scientist
India - Remote
Fusemachines
Unleash your AI Transformation with AI Products and AI Solutions.
About Fusemachines
Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world.
About the Role:
We are seeking a Data Scientist with hands-on Python experience and a proven ability to support software activities in an Agile software development lifecycle. The role calls for a well-rounded developer who can lead work on a cloud-based big data application built with a variety of technologies.
The ideal candidate will possess strong technical, analytical, and interpersonal skills. In addition, the candidate will lead developers on the team to achieve architecture and design objectives as agreed with stakeholders.
This is a remote, contract-based role.
Responsibilities:
- Work with developers on the team to meet product deliverables.
- Coach developers on the team to develop a scalable implementation.
- Convert legacy SAS and SPSS code to Python or R code.
- Work independently and collaboratively on a multi-disciplined project team in an Agile development environment.
- Contribute to detailed design and architectural discussions, as well as customer requirements sessions, to support the implementation of code and procedures for our big data product.
- Design and develop clear and maintainable code with automated tests, using open-source test frameworks such as Pytest, unittest, etc.
- Lead developers on the team to meet product deliverables.
- Identify and resolve code and design optimization issues.
- Learn and integrate with a variety of systems, APIs, and platforms.
- Interact with a multi-disciplined team to clarify, analyze, and assess requirements.
- Be actively involved in the design, development, and testing activities in big data applications.
Requirements:
- Minimum of 3 years of hands-on experience with Python, PySpark, Jupyter Notebooks, and Python environment managers such as Poetry or Pipenv.
- The ability to convert SAS and SPSS to Python.
- Ability and desire to learn Julia and R in order to convert legacy programs to Python and Spark for maintainability.
- Familiarity with Databricks. Azure Databricks is a plus.
- Familiarity with data cleansing, transformation, and validation.
- Proven technical leadership on prior development projects.
- Hands-on experience with a code versioning tool such as GitHub, Azure DevOps, Bitbucket, etc.
- Hands-on experience building pipelines in GitHub (or Azure DevOps, Jenkins, etc.).
- Hands-on experience with Spark.
- Hands-on experience using Relational Databases, such as Oracle, SQL Server, MySQL, Postgres or similar.
- Experience using Markdown to document code in repositories or automated documentation tools like PyDoc.
- Strong written and verbal communication skills.
- Self-motivated and able to work well in a team.
Nice to Have:
- Experience with Large Language Models (LLM).
- Experience with data visualization tools such as Power BI or Tableau.
- Experience with DevOps CI/CD tools and automation processes (e.g., Azure DevOps, GitHub, Bitbucket).
- Experience with containers and their environments (Docker, Podman, Docker Compose, Kubernetes, Minikube, Kind, etc.).
- Experience with Azure Cloud Services and Azure Data Factory.
Education:
Bachelor of Science degree from an accredited university
Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws.
Perks/benefits: Team events