Spark Solution Architect - Empower (remote/virtual - Canada-based)
Vancouver, BC, Canada
Hitachi Solutions
Company Description
Company Overview
Hitachi Solutions is a global solutions integrator passionate about designing, developing, and delivering cutting-edge cloud solutions to help our clients innovate across their entire business. Our firm develops the business services and technology powering some of the products you use every day – and is closely aligned with Microsoft and other leaders in the cloud computing space.
What sets Hitachi Solutions apart is both our industry focus and the intellectual property that we bring to our customers. Recognized for our achievements year after year, we strive to be the trusted advisor of large and medium-sized enterprises alike – helping them move fast to achieve strategic business initiatives with distinguished engineering, hard work, and compassion. With over 3,000 team members across 14 countries, our company has seen explosive growth and high customer satisfaction over its 18 years of focus. This has allowed us to offer exceptionally compelling salaries, 401k match, family leave, and health benefits. And no – we will not make you come into an office or ask for an inflexible work schedule.
As a part of Hitachi, Ltd., our company shares the long and rich history of innovation, financial strength, and international presence of one of the world’s largest companies. Since 1910, Hitachi, Ltd. has been a leader in manufacturing innovative products and solutions that support industry and social infrastructure around the globe, supported by 303,000 employees in over 100 countries and across 864 companies.
Job Description
NEW PRODUCT DEVELOPMENT AND INNOVATIONS TEAM
This position is housed in our New Product Development team formed in 2021. Joining this team represents an opportunity to fast-track your career and to work with a team of fun and nerdy colleagues in a disruptive startup atmosphere: focused on hypergrowth, moving quickly, and making mistakes in the furtherance of innovation and sound engineering.
Armed with an existing book of business and a stable financial parent, this group aims to transform our company into a billion-dollar product company by focusing on engineering excellence and making the cloud easier for our customers.
Spark Solution Architect (Databricks, Python, Spark)
This is a full-time role on the Empower product team, architecting Big Data solutions. Our Empower product is subscription-based intellectual property: a Platform-as-a-Service (PaaS) / Software-as-a-Service (SaaS) Datalakehouse and Business Intelligence offering.
Individuals in this role will architect complex data pipeline products that manage business-critical operations and large-scale analytics pipelines. Qualified applicants will have expert-level Spark data engineering skills and robust Python software engineering experience.
Responsibilities:
- Scope business problems and architect Big Data pipeline solutions – for structured, unstructured and live streaming data – in Spark and Databricks platforms
- Design complex data pipeline products which manage business-critical operations and large-scale analytics applications
- Use Airflow, dbt, Data Factory, or similar DAG tools to orchestrate robust data pipelines
- Support analytics, data science and/or engineering teams and understand their unique needs and challenges
- Design & POC integration of new features into proprietary Spark package(s)
- Partner with Product Management team to identify user stories and maintain prioritized backlog
- Serve as an owner of Empower's Spark repository; review & approve pull requests
- Enforce code standards: formatting, comments, documentation, unit tests, etc.
- Instill excellence into the processes, methodologies, standards, and technology choices embraced by the team
- Mentor developers in Spark and Python best practices
- Identify opportunities for continued improvement of existing proprietary Spark package(s)
- Dedicate time to continuous learning to keep the team apprised of the latest developments in the space
- Commit to developing technical maturity across the company
Qualifications
Please note: Although our position is remote / virtual / work-from-home, you MUST reside, and be authorized to work, in Canada.
- 10+ years of Data Engineering expertise including 6+ years designing and building data pipelines for batch and streaming data is REQUIRED
- 6+ years of experience with Spark/PySpark is REQUIRED
- 4+ years of experience with Databricks is REQUIRED
- 4+ years of hands-on experience implementing Big Data solutions in a cloud ecosystem, including Data/Delta Lakes, is REQUIRED
- 2+ years of experience with DAG tools (Data Factory, Airflow, dbt, or similar) is REQUIRED
- Azure cloud experience preferred; will consider AWS, GCP, or other cloud platform experience in lieu of Azure
- 2+ years of experience with Kafka or other live streaming technology is REQUIRED
- Experience with unit testing or data quality frameworks is REQUIRED
- 2+ years of experience with source control (git) on the command line is REQUIRED
- 5+ years of SQL experience, specifically writing complex, highly optimized queries across large volumes of data, is REQUIRED
- Experience with CI/CD deployment pipelines
- Knowledge of software design patterns
#LI-CA1
#REMOTE
#DATABRICKS
#SPARK
#PYTHON
#DATALAKEHOUSE
Additional Information
We are an equal opportunity employer. All applicants will be considered for employment without attention to age, race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.