Data Engineer (SQL/ETL/Cloud) (Greece)
Athens, Attica, Greece
Applications have closed
Causaly
Causaly is the fastest way to find evidence, explore hidden connections and make new predictions in biomedical science.
About us
Causaly accelerates how humans acquire knowledge and develop insights in Biomedicine. We enable researchers and decision-makers to discover evidence from millions of academic publications, clinical trials, regulatory documents, patents and other data sources... in minutes. Using our AI technology, we are developing the world’s biggest knowledge platform in Biomedicine powered by a high-precision Knowledge Graph.
We work with some of the world's largest biopharma companies and institutions on use cases spanning Drug Discovery, Safety and Competitive Intelligence. For example, read how Causaly is used in Target Identification here: AI-supported-target-identification-for-systemic-lupus-erythematosus.
We are backed by top VCs including Index Ventures, Pentech and Marathon.
What we are looking for
We are looking for a Data Engineer to join our AI team. In this role, the Data Engineer will:
- Gather and understand data based on business requirements.
- Import big data (millions of records) from various formats (e.g. CSV, XML, SQL, JSON) to BigQuery.
- Process data on BigQuery using SQL, e.g. sanitize fields, aggregate records and combine them with external data sources.
- Implement and maintain highly performant data pipelines with the industry’s best practices and technologies for scalability, fault tolerance and reliability.
- Build the necessary tools for monitoring, auditing, exporting and gleaning insights from our data pipelines.
- Work with multiple stakeholders, including software, machine learning, NLP and knowledge engineers, data curation specialists and product owners, to ensure all teams understand the data well and use it correctly.
- Manage backend data processes related to data delivery, curation and machine learning operations.
Requirements
Minimum Requirements
At a bare minimum, successful candidates will have:
- Master’s degree in Computer Science, Mathematics or a related technical field
- 3+ years of experience in backend data processing and data pipelines
- Excellent knowledge of Python and related libraries for working with data (e.g. pandas, Airflow)
- Excellent SQL and database skills
- Solid understanding of modern software development practices (testing, version control, documentation, etc.)
- Excellent knowledge of data processing principles
- A product and user-centric mindset
- Proficiency in Git version control
- Excellent problem-solving, ownership and organizational skills, with high attention to detail and quality
- Excellent verbal and written English
Bonus Requirements
Any of the following requirements will be considered a plus:
- Experience with NoSQL and big data technologies (e.g. Spark, Hadoop)
- Experience working with full-text search databases, such as Elasticsearch
- Experience with Knowledge Graphs and graph databases, such as Neo4j
- Experience with MLOps and machine learning in production
- Knowledge of Terraform, Kubernetes and/or Docker containers
- Experience with cloud computing providers (especially GCP or AWS)
- UNIX scripting skills
- Experience with TypeScript or JavaScript
Benefits
- Competitive Salary (see below)
- Hybrid working (home + office)
- Apple or Dell equipment (based on your OS preference)
- Annual training budget for professional development (e.g. books, video tutorials)
- Extensive sick-leave package
- Plenty of opportunity to take on more responsibility as we grow
- Be part of a multinational, diverse and exceptional team to build a transformative knowledge product that has real impact
- Annual team retreat to a secret destination
The salary we offer is based on skills, professional experience and team fit. Applicants will be required to demonstrate their qualifications through an interview and a written assignment.
Causaly welcomes applications from all backgrounds. We are committed to diversity regardless of gender, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation or gender identity. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.