Senior Data Engineer
Warsaw, Masovian Voivodeship, Poland - Remote
Intellectsoft
We are Intellectsoft, a digital transformation consultancy group and engineering company that delivers cutting-edge solutions for global organisations and technology startups. Since 2007, we have been helping companies and established brands reimagine their business through digitalization. We're looking for an exceptional Senior Data Engineer to join our team. Are you up for a challenge?
Project description:
The project is a biotechnology research initiative that has earned recognition as a trusted provider of clinical genetic testing and an ideal collaborator for developing precision medicine solutions. Patients are central to the project's mission, driving partnerships with numerous non-profit organizations to support patients and healthcare professionals in their quest for answers. The main focus is Clinical Cancer Diagnostics; however, the project is designed to expand into additional areas such as Molecular Diagnostic Testing, Cardio Genetics, and Neuro Genetics.
Responsibilities:
- Develop connectors for Kafka to streamline the syncing of updates from source data repositories.
- Establish partitioned Kafka topics to optimize the synchronization of updates to destination data marts.
- Leverage Apache Flink for crafting intricate data analytics workloads to enable real-time monitoring and transformations.
- Deploy dashboards using Datadog and CloudWatch to uphold system health and fulfil user needs.
- Institute schema registries to uphold data governance standards while catering to diverse data requirements.
- Collaborate closely with a West Coast-based scrum team, contributing to daily pull request submissions, code reviews, documentation maintenance, backlog management, and build validation across environments within sprint cycles lasting 2-4 weeks.
- Coordinate with other scrum teams to ensure coherence on data contracts, API specifications, and deployment timelines.
- Architect database schemas with a focus on query access patterns.
- Establish and manage CI/CD pipelines using infrastructure-as-code principles.
- Gradually migrate on-premises ETL jobs from PHP to Flink and Glue processes on AWS.
- Work alongside QA Engineers to develop automated test suites.
- Engage with end-users to troubleshoot service interruptions and champion our data product offerings.
- Maintain vigilant oversight of data quality, promptly addressing discrepancies, latency issues, and defects.
Requirements
Must have:
- Proficiency in Apache Kafka (preferably MSK flavour), Debezium, Python, Apache Flink or PySpark Streaming, MySQL (preferably RDS flavours), CDK or Terraform, Athena, Glue, Lambda, AppFlow, HANA/4, PHP, Redis, Docker, and JavaScript.
- At least 6 years of hands-on experience working within professional scrum teams, or an equivalent educational background.
- A minimum of 3 years of practical experience in designing and indexing relational databases.
- At least 2 years of practical experience in constructing and managing real-time data streams.
- At least 1 year of practical experience in developing monitoring dashboards.
Nice to have:
- A Master’s degree in computer science, data science, mathematics, or life sciences.
- A foundational grasp of genomic concepts and terminology.
- Flexibility in availability.
- Experience in building data APIs and providing Data as a Service.
- Proficiency in integrating with SaaS platforms like SAP and Salesforce.
- Familiarity with PHP MVC frameworks such as Symfony, or readiness to acquire such skills.
- Knowledge of Atlassian products, including Jira, Confluence, and Bamboo.
- Proficiency with system diagramming tools like Miro, Lucidchart, or Visio.
Benefits
- 36 paid absence days per year to support each specialist's work-life balance, plus 1 additional day for each subsequent year of cooperation with the company
- Up to 10 unused absence days can be added to income after 12 months of cooperation
- Health insurance compensation
- Depreciation coverage for personal laptop usage for project needs
- Udemy courses of your choice
- Regular soft-skills training
- Excellence Centers meetups