Data Engineer
Lisbon, Lisbon, Portugal
Applications have closed
Defined.ai
Defined.ai: Dive into the largest AI training data marketplace. Explore smart data for ethical AI and seamlessly buy, sell, or commission top-quality training datasets.
Who is Defined.ai? Well, from a technical point of view, we leverage the power of a global crowd to provide some of the world’s biggest companies with the high-quality data they need to power their artificial intelligence. We’re instrumental to the progression and development of artificial intelligence and we couldn’t be prouder or more inspired to be involved in an industry that is changing the world.
From a personal point of view, we’re a group of big thinkers, high achievers and creative problem solvers. We bond over our shared love of software engineering, data science, and strong coffee. We like online gaming, running marathons, and team drinks. We celebrate authenticity and diversity and we’re invested in what we do. Our mission? World domination, obviously!
Who are we looking for?
Do you have the drive to work in an innovative and ambitious environment?
We’re looking for someone with a determined and proactive mindset, someone inspired and passionate to help us achieve our goals. Our successful candidate is a strong critical thinker, reliable and transparent, with an ability to learn and communicate. We are looking for someone special to contribute to our unique culture.
What will you do?
- Design and implement scalable Python-based data pipelines.
- Make sure all data is readily available for processing by data scientists.
- Set software engineering tools, platforms, and best practices while performing trade-off analysis to best match engineering, product, and project constraints and expectations.
- Collaborate with other software engineering teams such as SREs and DevOps to achieve your team’s goals.
- Help the Product Manager in structuring, breaking down, and prioritizing the product roadmap into backlog work items.
- Work together with the Platform Engineering team and the Solution Developers team to weigh the trade-offs between developing new customer/project features in the platform or off the platform.
Requirements
- BSc or MSc in Computer Science or similar background.
- Advanced experience with Python- and PySpark-based data pipelines and software quality best practices.
- Worked with Azure services and pipelines such as Synapse Analytics (mainly PySpark Jobs, Pipelines, and Notebooks), ADLS, Blob Storage, DevOps, and SQL and NoSQL databases.
- Happy with administering and configuring Azure DevOps Services for all resources their team uses (Boards, CI/CD Pipelines, Repos, etc.).
- Comfortable with evaluating and applying software design and architectural patterns/principles.
- Knowledgeable about RESTful APIs, from the provider as well as the consumer point of view.
- Accustomed to working with data lake and data pipeline architectures.
- Understanding of Data Warehousing concepts.
- Worked with pipeline orchestration tools like Apache Airflow or Flyte.
- Comfortable with versioning systems, preferably Git.
- Proficient in both written and spoken English.
Nice to haves:
- Orchestration with Kubernetes and containerization with Docker.
- Experience with Hadoop.
- Worked in a team with an Agile methodology.
- Knowledge of one of these: Conversational AI (e.g., Rasa framework), DevOps, or Speech engineering.
Benefits
You spend a lot of your time at work, so it should be challenging, fun and interesting. At Defined.ai it will be all of those things and more. Here’s what we offer:
- Flexible working schedule and hybrid model. We know comfort can boost creativity and performance, so you can manage your schedule and work from either one of our modern office spaces or home.
- Excellent career development opportunities in a high growth company. With us, you can accomplish your career goals and follow a well-described career path with the support of your supervisor.
- Culture of feedback and continuous improvement. AI is a fast-paced area, so we keep track of tech trends, and we always ask for feedback.
- An international and diverse team. We have more than 30 nationalities at our 3 locations, and we provide language classes.
- Continuous training opportunities. You can choose from many options: hands-on workshops, unlimited access to Udemy, and formal development opportunities.
- We love to have fun together. We joke a lot, and we can't imagine work without fun activities: we have already surfed, raced karts, and played soccer together.
About Us
Defined.ai offers a platform with multiple data delivery options that leverages machine learning technology and human intelligence to deliver quality-guaranteed training data for AI systems. The platform offers self-service and fully customizable solutions that deliver high-quality project-specific training data, enabling AI products to reach the market more quickly. It is this business model that has allowed Defined.ai to raise a total of $63.6M in funding over 4 rounds. Our value proposition is quality, privacy, speed and scale, covering more than 50 different languages. With strong expertise in speech and natural language processing technologies, we have been serving AI companies and Fortune 500 companies since day one. Defined.ai was founded in Seattle and has offices in Lisbon and Porto.
Tags: Agile Airflow APIs Architecture Azure CI/CD Computer Science Conversational AI Data pipelines Data Warehousing DevOps Docker Engineering Git Hadoop Kubernetes Machine Learning NLP NoSQL Pipelines Privacy PySpark Python SQL
Perks/benefits: Career development Flex hours Flex vacation Home office stipend Startup environment Team events Unlimited paid time off