Data Engineer
Budapest
Prezi
Welcome to Prezi, the presentation software that uses motion, zoom, and spatial relationships to bring your ideas to life and make you a great presenter. Instead of traditional slides, Prezi uses an open canvas to help people explore ideas, collaborate more effectively, and create visually dynamic presentations. Founded in 2009, with offices in San Francisco, Budapest and Riga, Prezi gives its users a visually engaging, personalized way to express their ideas anytime, anywhere.
The company's vision extends well beyond authoring software: Prezi aims to be the inspiration and enabler of world-changing ideas for people, organizations, and businesses. Prezi has enjoyed explosive growth and quickly built a following of passionate users. More than 85 million people from over 190 countries use Prezi on their desktops, browsers and mobile devices. New users join every month, and more than one prezi is created every second. The company has over 300 employees and is backed by premier investors, including Accel Partners, Sunstone Capital (based in Copenhagen), and TED.
We are looking for a talented software engineer to join our Data Infrastructure Team.
Data @ Prezi
We believe that data analytics should be easy for both technical and non-technical people. We aim to create the tools, build the data platform, and train our users to make this possible.
To better understand our users, we collect data from all parts of our product, push it to a Kafka cluster, and ingest it into Amazon S3 distributed storage. We then use open source tools like Apache Gobblin to further process and analyze the data. We collect around 1 TB of data per day. Analysts who are interested in the data can explore it in Zeppelin or schedule jobs in our ETL system to process it with Spark/Trino. The data is then exposed through Hive tables for further analysis and reporting.
See more from our talk at Big Things.
Responsibilities
- Keep petabyte-scale data flowing through our pipeline. We have hundreds of data-analytics jobs running every day.
- Define best practices.
- Work closely together with our Data Team
- Design and build data architecture and related tooling, including:
  - Logging framework
  - ETL and batch processing infrastructure
  - Realtime data systems, data pipelines, pub/sub interfaces
  - Data warehouse modeling and design
  - BI tooling, OLAP
- Future-proof our system to enable new business opportunities, e.g. through machine learning
- Partner with Data Analysts and provide tools and guidance on schema modeling and query optimization
- Oversee the data lifecycle, understand its current and future use cases, and build a scalable, maintainable solution
- Automate data quality assurance and provide best practices to teams.
- Work together with Product Teams and provide them simple APIs to push and pull data
You have
- Experience with k8s (at scale)
- Bachelor’s degree in Computer Science or a related field
- Minimum 3 years of experience in software engineering
- Fluency in a scripting language (e.g. Python, Ruby)
- Familiarity with *nix environment and tooling (e.g. bash, ssh, git)
- Up-to-date knowledge about big data tools and techniques
- Experience developing and working with ETL pipelines
- SQL knowledge
- Strong interpersonal and communication skills; ability to consult, partner and work effectively with business and technical partners across functions
- Enthusiasm for DevOps: we write it, we run it!
- Excitement for building on open source tools as well as contributing to existing products
- A get-it-done attitude: you are smart and quick, with a focus on delivery
- Bravery to try, pilot and eventually productise new technologies
You might also have
- Experience with Amazon Web Services or similar cloud provider
- Experience with continuous integration, configuration management
- Experience with data warehouse design and maintenance
- Background in statistics, data analysis, data science, and machine learning
What we offer you
- Deploy to production from day one
- A working environment that supports career and skill growth
- People who listen to your opinions and value them
- Remote-first attitude (but we also have a centrally located office in downtown Budapest)
- Stock options, so you become a co-owner of the company you work for
- Be yourself - we are proud to be colorful!