Senior/Principal Software Engineer - Data Analytics
Pune, India
PubMatic
PubMatic maximizes customer value to deliver the programmatic digital marketing supply chain of the future and stay ahead of advertising technology trends.
Company Description
PubMatic (Nasdaq: PUBM) is an independent technology company maximizing customer value by delivering digital advertising’s supply chain of the future. PubMatic’s sell-side platform empowers the world’s leading digital content creators across the open internet to control access to their inventory and increase monetization by enabling marketers to drive return on investment and reach addressable audiences across ad formats and devices. Since 2006, our infrastructure-driven approach has allowed for the efficient processing and utilization of data in real time. By delivering scalable and flexible programmatic innovation, we improve outcomes for our customers while championing a vibrant and transparent digital advertising supply chain.
Job Description
PubMatic’s data platform is one of the largest in the tech industry, operating at petabyte scale. Our cluster comprises thousands of machines across multiple data centres around the globe. Given the very high data throughput and the challenges of scale, many proven big data tools have at times failed for our use cases. With the help of our brilliant engineering team, we have come up with many smart, innovative ideas, and have sometimes employed anti-patterns, to build a highly scalable and robust data platform.
As part of our team expansion, we’re looking for a strong Software Development Engineer (Data) to work with us to build highly scalable data platforms and services.
- Build, design and implement our highly scalable, fault-tolerant, highly available big data platform to process terabytes of data and provide customers with in-depth analytics.
- Develop big data pipelines using a modern technology stack such as Spark, Hadoop, Kafka, HBase, Hive and Presto.
- Develop analytics applications from the ground up using a modern technology stack such as Java, Spring, Tomcat, Jenkins, REST APIs, JDBC, Amazon Web Services and Hibernate.
- Build data pipelines to automate high-volume data collection and processing to provide real-time data analytics.
- Customize PubMatic’s reporting and analytics platform based on customers’ requirements and deliver scalable, production-ready solutions.
- Lead multiple projects to develop features for the data processing and reporting platform; collaborate with product managers, cross-functional teams and other stakeholders to ensure successful delivery of projects.
- Use the established mechanisms to fetch data from different external data sources and reconcile it with PubMatic’s processed data.
- Collaborate with functional teams to deliver end-to-end products and features, and fix bugs to improve performance.
- Develop robust, fault-tolerant systems and monitor the impact of changes on the data processing pipeline and its performance.
- Leverage a broad range of PubMatic’s data architecture strategies and propose both data flows and storage solutions.
- Manage Hadoop MapReduce and Spark jobs and resolve any ongoing issues with operating the cluster.
- Work closely with cross-functional teams to improve the availability and scalability of the large data platform and the functionality of PubMatic software.
- Implement professional software engineering best practices across the full software development life cycle, including coding standards, code reviews, committing to GitHub, preparing documents in Confluence, continuous delivery using Jenkins, automated testing, and operations.
- Participate in Agile/Scrum processes such as sprint planning, sprint retrospective, backlog grooming, user story management, work item prioritization, etc.
- Hold frequent discussions with product managers about the software features to include in the PubMatic Data Analytics platform, and understand the technical aspects of customer requirements from them.
- Keep in regular touch with the quality engineering team, which ensures the quality of the platforms/products and the performance SLAs of the Java-based microservices and Spark-based data pipelines.
- Support customer issues over email or JIRA (bug tracking system), and provide updates and patches to customers to fix the issues.
- Discuss with the technical writing team the technical documents that are published on the documentation portal.
- Perform code and design reviews for code implemented by peers or as per the code review process.
Qualifications
- 8+ years of proven experience in designing, implementing and delivering complex, scalable and resilient platforms and services
- Experience in building high throughput big data platforms and systems
- Hands-on experience in big data technologies (Spark/Kafka/Spark Streaming) and other open source data technologies
- Experience in OLAP (Snowflake, Vertica or similar) would be an added advantage.
- Ability to understand vague business problems and convert them into working solutions
- Excellent spoken and written interpersonal skills with a collaborative approach.
- Dedication to developing high-quality software and products
- Curiosity to explore and understand data is a strong plus
- Deep understanding of big data and distributed systems (MapReduce, Spark, Hive, Kafka, Oozie, Airflow)
Additional Information
Return to Office: PubMatic employees around the globe have returned to our offices via a hybrid work schedule (3 days “in office” and 2 days “working remotely”) that is intended to maximize collaboration, innovation, and productivity among teams and across functions. All PubMatic employees in the US and India are required to be fully vaccinated to return to our offices. Covid-19 boosters are not required at this point in time.
Benefits: Our benefits package includes the best of what leading organizations provide, such as stock options, paternity/maternity leave, healthcare insurance, and broadband reimbursement. In addition, when we’re back in the office, we all benefit from a kitchen loaded with healthy snacks and drinks, catered lunches, and much more!
Diversity and Inclusion: PubMatic is proud to be an equal opportunity employer; we don’t just value diversity, we promote and celebrate it. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Tags: Agile Airflow APIs Architecture AWS Big Data Confluence Data Analytics Data pipelines Distributed Systems Engineering GitHub Hadoop HBase Java Jira Kafka Map Reduce OLAP Oozie Open Source Pipelines Scrum SDLC Snowflake Spark Streaming Testing
Perks/benefits: Career development Equity Flex hours Insurance Parental leave