Software Engineer, Data Engineering
Tokyo, Japan
Appier
Comprehensive AI-Powered Solutions: Smoother Operations. Elevated Customer Experience. Better Performance.
About Appier
Appier is a software-as-a-service (SaaS) company that uses artificial intelligence (AI) to power business decision-making. Founded in 2012 with a vision of democratizing AI, Appier’s mission is turning AI into ROI by making software intelligent. Appier now has 17 offices across APAC, Europe and the U.S., and is listed on the Tokyo Stock Exchange (ticker number: 4180). Visit www.appier.com for more information.
About the role
At Appier, we have many opportunities to work with data each and every day. As a Software Engineer, Data Engineering on the Appier Data Engineering team, your primary responsibility will be to partner with key stakeholders, machine learning scientists, data analysts, and software engineers to support and enable the continued growth that is critical to Appier. You will be responsible for creating the technology and data architecture that moves, transforms and stores the data used to improve our AI capabilities and provide insight to our customers. You will also help translate business needs into requirements and identify opportunities for efficiency.

In addition to extracting, transforming and storing data, you will be expected to use your expertise to build extensible data models and data governance, and to provide meaningful recommendations and actionable strategies to partnering machine learning scientists and data analysts for performance enhancements and the development of best practices, including the streamlining of data sources and related programmatic initiatives.

The ideal candidate will have a passion for working in white space and creating impact from the ground up in a fast-paced environment. We are hiring at all levels of seniority. This is a local hire position.
Responsibilities
- Partner with leadership, engineers, product managers, data scientists, and data analysts to understand data needs
- Apply proven expertise and build high-performance scalable data warehouses
- Design, build and launch efficient & reliable data pipelines to move and transform data
- Securely source external data from numerous partners
- Intelligently design data models for optimal storage and retrieval
- Deploy comprehensive data quality checks to ensure high quality of data
- Optimize existing pipelines and maintain all domain-related data pipelines
- Own the end-to-end data engineering components of the solution
- Participate in on-call shifts as needed to support the team
- Design and develop new systems in partnership with software engineers and scientists to enable quick and easy consumption of data
About you
[Minimum qualifications]
- BS/MS in Computer Science or a related technical field
- 2+ years of Python or other modern programming language development experience
- 2+ years of SQL and relational databases experience
- 2+ years experience in custom ETL design, implementation and maintenance
- Experience with workflow management engines (e.g. Airflow, Google Cloud Composer, AWS Step Functions, or Azure Data Factory)
- Experience with Data Modeling
- Experience operating a Spark or Hadoop cluster
- Experience with managing data storage using HDFS and Cassandra
[Preferred qualifications]
- Experience with more than one programming language (e.g. Scala or Java)
- Contributions to open source projects are a huge plus (please include your GitHub page)
- Experience designing and implementing real-time pipelines
- Experience with data quality and validation
- Experience with SQL performance tuning and end-to-end process optimization
- Experience with anomaly/outlier detection
- Experience with notebook-based Data Science workflow
- Experience with Hadoop, Hive, Flink, Storm, Presto and related big data systems is a plus
- Experience with Public Cloud like AWS, Azure, or GCP is a plus
Perks/benefits: Career development, startup environment