Data Engineer
Porto, Porto District, Portugal
Applications have closed
Defined.ai
Defined.ai: Dive into the largest AI training data marketplace. Explore smart data for ethical AI and seamlessly buy, sell, or commission top-quality training datasets.
Who is DefinedCrowd? Well, from a technical point of view, we leverage the power of a global crowd to provide some of the world’s biggest companies with the high-quality data they need to power their artificial intelligence. We’re instrumental to the progression and development of artificial intelligence and we couldn’t be prouder or more inspired to be involved in an industry that is changing the world.
From a personal point of view, we’re a group of big thinkers, high achievers and creative problem solvers. We bond over our shared love of software engineering, data science, and strong coffee. We like online gaming, running marathons, and team drinks. We celebrate authenticity and diversity and we’re invested in what we do. Our mission? World domination, obviously!
What will you do?
- Ensure the reliability, efficiency and quality of the data provided to end users
- Develop reliable, easy-to-modify data processing pipelines
- Stay aware of other departments' data needs
- Adapt and scale our data infrastructure as our operational services grow
- Build and improve infrastructure for data science, business intelligence, analytics and other stakeholders
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery and re-designing infrastructure for greater scalability
Requirements
Who are we looking for?
Do you have the drive to work in an innovative and ambitious environment?
We’re looking for someone with a determined and proactive mindset, someone inspired and passionate to help us achieve our goals. Our successful candidate is a strong critical thinker, reliable and transparent, with an ability to learn and communicate. We’re looking for someone special to contribute to our unique culture.
Our Data Engineers have:
- Background in Computer Science, Engineering or a related field
- Software development experience in one or more of the following languages: Java, Python, C#
- Understanding of distributed systems
- Ability to design scalable, data-intensive architectures
- Knowledge of the most common trade-offs in data storage and retrieval
- Experience with large-scale computation frameworks (e.g., Spark)
- Experience with Kafka and/or RabbitMQ
- Experience with Azure Data Factory and Synapse
- Experience with Azure Data Lake Storage and Airflow
- Experience with Microsoft Power BI or Apache Superset
- Experience with SQL and/or NoSQL databases
- DevOps skills: CI/CD pipelines, Bash Scripting, Docker, Kubernetes
Benefits
Why Join Us?
You spend a lot of your time at work, so it should be challenging, fun and interesting. At DefinedCrowd, it will be all of those things and more. Here’s what we offer:
- A unique culture, healthy working environment, and a flexible working schedule
- Excellent career development opportunities in a high growth company
- Access to an excellent compensation and benefits package
- An international and diverse team, representing more than 30 nationalities at our 4 locations
- Global mobility and relocation support with the help of our specialized team
- Continuous training opportunities through hands-on workshops and formal development programs
- The possibility to join our offices either in Lisbon or in Porto
About Us
DefinedCrowd offers a platform with multiple data delivery options that leverages machine learning technology and human intelligence to deliver quality-guaranteed training data for AI systems. The platform offers self-service and fully customizable solutions that deliver high-quality, project-specific training data, enabling AI products to reach the market more quickly. It is this business model that has allowed DefinedCrowd to raise a total of $63.6M in funding over 4 rounds. Our value proposition is quality, privacy, speed and scale, covering more than 50 different languages. With strong expertise in speech and natural language processing technologies, we have been serving AI companies and Fortune 500 companies since day one. DefinedCrowd was founded in Seattle and has offices in Lisbon, Porto and Tokyo.
Tags: Airflow Azure Business Intelligence CI/CD Computer Science DevOps Distributed Systems Docker Engineering Kafka Kubernetes Machine Learning NLP NoSQL Pipelines Power BI Python Spark SQL
Perks/benefits: Career development Flex hours Relocation support Startup environment