PlayStation isn’t just the Best Place to Play; it’s also the Best Place to Work. We’ve thrilled gamers since 1994, when we launched the original PlayStation. Today, we’re recognized as a global leader in interactive and digital entertainment. The PlayStation brand falls under Sony Interactive Entertainment, a wholly owned subsidiary of Sony Corporation.
Data Science Engineer, Machine Learning Platform
PlayStation HQ in San Mateo, CA
We seek a Platform Engineer to support our transition to the cloud. In this role you will deliver tools for our new cloud-based machine learning, analytics and big data platform, which is hosted on AWS and built on EMR Permanent and Transitory Clusters, S3 storage, in-house software and open source tools. This will empower our global teams to quickly apply advanced Machine Learning to a variety of problems. We value positive personalities who inspire change. If this is you, please apply!
- Collaborate globally with data and cloud engineers to build a Machine Learning AWS-based platform.
- Engage with data scientists to improve the platform and ensure we conform to standard methodologies that meet our requirements.
- Perform creative and complex application programming activities, including coding, testing, implementation and documentation of solutions.
- Evaluate and debug services at all stages of the development cycle, from development to production.
- Document new and existing projects to improve community understanding and contribution.
- Strong experience in crafting, deploying and operating highly available, scalable and fault-tolerant systems using Amazon Web Services (EMR Clusters, S3, ELBs, EC2, EBS).
- Strong working knowledge of deploying and configuring Apache Spark clusters, ideally on EMR clusters.
- Strong proficiency in Python.
- Detailed knowledge and understanding of more than one version control system, including Git.
- Knowledge of large open source projects and how they operate, preferably Airflow.
- Adept at working within Unix-like environments, including shell scripting and system-level knowledge.
- Practical exposure to Continuous Integration/Continuous Delivery tools like Jenkins to connect development with testing through pipelines.
- Big-Data Cloud Scalability.
- Hive metastore and Hadoop.
- JDBC/ODBC, SQL query processing, and distributed query engines.
- Configuration Management tools like Ansible and Terraform.
- Docker container infrastructure.
- Monitoring and logging tools like Splunk.
- JupyterHub deployment and Apache Livy integration.
- Visualization tools such as Tableau.