Data Engineer

Lisbon, Lisbon, Portugal

Applications have closed

Defined.ai

Defined.ai: Dive into the largest AI training data marketplace. Explore smart data for ethical AI and seamlessly buy, sell, or commission top-quality training datasets.

Who is Defined.ai? Well, from a technical point of view, we leverage the power of a global crowd to provide some of the world’s biggest companies with the high-quality data they need to power their artificial intelligence. We’re instrumental to the progression and development of artificial intelligence and we couldn’t be prouder or more inspired to be involved in an industry that is changing the world.

From a personal point of view, we’re a group of big thinkers, high achievers and creative problem solvers. We bond over our shared love of software engineering, data science, and strong coffee. We like online gaming, running marathons, and team drinks. We celebrate authenticity and diversity and we’re invested in what we do. Our mission? World domination, obviously!


Who are we looking for?

Do you have the drive to work in an innovative and ambitious environment?

We’re looking for someone with a determined and proactive mindset, someone inspired and passionate to help us achieve our goals. Our successful candidate is a strong critical thinker, reliable and transparent, with an ability to learn and communicate. We are looking for someone special to contribute to our unique culture.


What will you do?

  • Design and implement scalable Python-based data pipelines.
  • Make sure all data is readily available for processing by data scientists.
  • Define software engineering tools, platforms, and best practices, performing trade-off analysis to best match engineering, product, and project constraints and expectations.
  • Collaborate with other software engineering teams such as SREs and DevOps to achieve your team’s goals.
  • Help the Product Manager in structuring, breaking down, and prioritizing the product roadmap into backlog work items.
  • Work with the Platform Engineering team and the Solution Developers team to weigh the trade-offs of developing new customer/project features on or off the platform.

Requirements

  • BSc or MSc in Computer Science or similar background.
  • Advanced experience with Python- and PySpark-based data pipelines and software quality best practices.
  • Experience with Azure services and pipelines such as Synapse Analytics (mainly PySpark Jobs, Pipelines, and Notebooks), ADLS, Blob Storage, DevOps, and SQL and NoSQL databases.
  • Comfortable administering and configuring Azure DevOps Services for all resources their team uses (Boards, CI/CD Pipelines, Repos, etc.).
  • Comfortable with evaluating and applying software design and architectural patterns/principles.
  • Knowledgeable about RESTful APIs, from both the provider and the consumer point of view.
  • Accustomed to working with data lake and data pipeline architectures.
  • Understanding of Data Warehousing concepts.
  • Experience with pipeline orchestration tools like Apache Airflow or Flyte.
  • Comfortable with versioning systems, preferably Git.
  • Proficient in both written and spoken English.


Nice to haves:

  • Orchestration with Kubernetes and containerization with Docker.
  • Experience with Hadoop.
  • Experience working in a team with an Agile methodology.
  • Knowledge of one of the following: Conversational AI (e.g., Rasa framework), DevOps, or speech engineering.

Benefits

You spend a lot of your time at work, so it should be challenging, fun and interesting. At Defined.ai it will be all of those things and more. Here’s what we offer:

  • Flexible working schedule and hybrid model. We know comfort can boost creativity and performance, so you can manage your own schedule and work from either one of our modern office spaces or home.
  • Excellent career development opportunities in a high growth company. With us, you can accomplish your career goals and follow a well-described career path with the support of your supervisor.
  • Culture of feedback and continuous improvement. AI is a fast-paced area, so we keep track of tech trends, and we always ask for feedback.
  • An international and diverse team. We have more than 30 nationalities at our 3 locations, and we provide language classes.
  • Continuous training opportunities. You can choose from many options: hands-on workshops, unlimited access to Udemy, and formal development opportunities.
  • We love to have fun together. We joke a lot, and we can't imagine work without fun activities – we have already surfed, raced karts, and played soccer together.

About Us

Defined.ai offers a platform with multiple data delivery options that leverages machine learning technology and human intelligence to deliver quality-guaranteed training data for AI systems. The platform offers self-service and fully customizable solutions that deliver high-quality, project-specific training data, enabling AI products to reach the market more quickly. It is this business model that has allowed Defined.ai to raise a total of $63.6M in funding over 4 rounds. Our value proposition is quality, privacy, speed, and scale, covering more than 50 different languages. With strong expertise in speech and natural language processing technologies, we have been serving AI companies and Fortune 500 companies since day one. Defined.ai was founded in Seattle and has offices in Lisbon and Porto.

Tags: Agile Airflow APIs Architecture Azure CI/CD Computer Science Conversational AI Data pipelines Data Warehousing DevOps Docker Engineering Git Hadoop Kubernetes Machine Learning NLP NoSQL Pipelines Privacy PySpark Python SQL

Perks/benefits: Career development Flex hours Flex vacation Home office stipend Startup environment Team events Unlimited paid time off

Region: Europe
Country: Portugal
Category: Engineering Jobs
