Data Engineer Lead/Architect

Bengaluru, KA, India

Expleo

Expleo is a trusted partner for end-to-end, integrated engineering, quality services and management consulting for digital transformation.


Overview

Data Engineer Lead/Architect

Overall 14+ years of hands-on experience in Big Data.

Payment gateway experience.

Responsibilities

  • Define and execute the strategy roadmap for Big Data architecture and design, and for overall enterprise data governance and management across the enterprise and data centers.
  • Establish standards and guidelines for the design & development, tuning, deployment and maintenance of information, advanced data analytics, ML/DL, data access frameworks and physical data persistence technologies.
  • Research advanced data techniques, including data ingestion, data processing, data integration, data access, data visualization, text mining, data discovery, statistical methods, and database design and implementation.
  • Experience in developing use cases, functional specs, design specs, ERDs etc.
  • Experience with the design and development of multiple object–oriented systems.
  • Experience with extending Free and Open–Source Software (FOSS) or COTS products.
  • Architect solutions for key business initiatives ensuring alignment with future state analytics architecture vision.
  • Work closely with the project teams as outlined in the SDLC engagement model to provide guidance in implementing solutions at various stages of projects.
  • Engage constructively with project teams to support project objectives through the application of sound architectural principles.
  • Develop and validate that the proposed solution architecture supports the stated & implied business requirements of the project.
  • Review technical team deliverables for compliance with architecture standards and guidelines.
  • Adopt innovative architectural approaches to leverage in-house data integration capabilities consistent with architectural goals of the enterprise.
  • Create architectural designs for different stakeholders that provide a conceptual definition of the information processing needs for the delivery project.
  • Provide right-fit solutions to implement a variety of data and analytics requirements using Big Data technologies in the Insurance/Healthcare space.
  • Provide expertise in defining Big Data Technical Architecture and design using Hadoop, NoSQL and Visualization tools/platforms.
  • Ability to get down to the programming/code level and provide hands-on expertise in technical features of Big Data tools/platforms.
  • Lead a team of designers/developers and guide them throughout the system implementation life cycle.
  • Engage client Architects, Business SMEs and other stakeholders during Architecture, Design and implementation phases.
  • Define new data infrastructure platform to capture vehicle data. Research and develop new data management solutions, approaches and techniques for data privacy and data security, and advanced Data Analytics systems.
  • Design complex and distributed software modules using Big Data technologies, Streaming Data Technologies and Java/JEE for real-time stream/event data processing and real-time parameterized and full-text searching of data.
  • Design NoSQL data models to achieve strong data consistency despite the lack of ACID transactions.
  • Design and enhance highly scalable, high-performance and fault-tolerant architectures across all tiers of the software, and develop modules based on the architecture.
  • Design and enhance the architecture that supports zero-downtime requirements while different components of the system are updated/upgraded.
  • Integrate with IoT devices using a data ingestion pipeline that allows application of configurable real-time rules
  • Use machine learning algorithms on the real-time data for predictive analytics
  • Design software, write code, write unit test cases, test code and review code on a daily basis.
  • Lead multiple project modules and development teams simultaneously and ensure the quality of their deliverables and their timely delivery
  • Create/enhance scalable, high performance and fault-tolerant architectures
  • Use caching, queuing, sharding, concurrency control, etc. to improve performance and scalability
  • Enhance zero-downtime architecture during product deployment upgrades
  • Identify the performance and scalability bottlenecks and provide solutions to resolve them
  • Perform code reviews, provide feedback and oversee code corrections on a daily basis to ensure compliance with the development guidelines.
  • Provide technical expertise in the diagnosis and resolution of issues, including the determination and provision of workaround solutions
  • Experience in data modeling and data processing architectures using NoSQL databases to achieve strong data consistency despite the lack of ACID transactions.
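The "data ingestion pipeline that allows application of configurable real-time rules" responsibility above can be sketched in a few lines of pure Python. This is an illustrative sketch only: the `Rule` class and `apply_rules` function are hypothetical names, and a production pipeline would run these rules on a streaming platform (e.g. Kafka/Flink) rather than over in-process events.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A hypothetical configurable rule: a predicate over an event plus an action tag."""
    name: str
    predicate: Callable[[Dict], bool]
    action: str  # e.g. "alert", "drop", "store"


def apply_rules(event: Dict, rules: List[Rule]) -> List[str]:
    """Return the actions triggered by a single IoT event, in rule order."""
    return [r.action for r in rules if r.predicate(event)]


# Rules are plain data, so they can be loaded or updated at runtime --
# that is the "configurable" part of the requirement.
rules = [
    Rule("over_temp", lambda e: e.get("temp_c", 0) > 90, "alert"),
    Rule("keep_all", lambda e: True, "store"),
]

actions = apply_rules({"device": "pump-1", "temp_c": 97}, rules)
# actions == ["alert", "store"]
```

Because rules are evaluated per event, the same structure maps naturally onto a per-record operator in a stream-processing job.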

Qualifications

Preferably BE/ME/BTech/MTech/MSc in Computer Science or a related engineering field, or MCA.

Should have relevant industry certifications, such as Big Data Architect.

Must possess analytical, interpersonal, leadership and organizational skills.

Essential skills

  • Experience with distributed, scalable Big Data stores or NoSQL databases, including Cassandra, Cosmos DB, Cloudbase, HBase, or Bigtable.
  • Experience with Apache Hadoop and the Hadoop Distributed File System (HDFS), the S3 protocol, ObjectStore, and processing large data stores.
  • Experience with Hadoop, MapReduce, YARN, ZooKeeper, Pig, Hive, Oozie, Flume, Sqoop, Spark, Storm, HBase, MongoDB and Cassandra.
  • Experience with R, Qlik, Tableau, Postgres, Kerberos, Knox and Ranger.
  • Experience with J2EE technologies, including web services development and the Enterprise Application Integration space.
  • Experience with message-queue technologies like RabbitMQ or ActiveMQ would be considered a plus.
  • Experience with other NoSQL Databases (CouchDB, Neo4j, InfiniteGraph, JanusGraph, ArangoDB, OrientDB, CockroachDB, Redis, Vitess etc.)
  • Real-time stream data processing frameworks: Beam, Spark Streaming, Flink, Kafka Stream, Apex, Heron, Storm etc.
  • Experience in Queuing: Kafka, ActiveMQ, RabbitMQ etc.
  • Experience with ETL design using tools Informatica, Talend, Oracle Data Integrator (ODI), Dell Boomi or equivalent
  • Experience with Big Data & Analytics solutions Hadoop, Pig, Hive, Spark, Spark SQL Storm, AWS (EMR, Redshift, S3, etc.)/Azure (HDInsight, Data Lake Design, Analytical services/Power BI)
  • Experience in developing code around Hadoop with Oozie, Sqoop, Pig, Hive, HBase, Avro, Parquet, Spark, NiFi.
  • Experience with Big Data/NoSQL platforms (e.g., Spark, Cassandra, Hadoop).
  • Experience with Lucene-based cluster technology (e.g., Elasticsearch, Solr).
  • Experience with at least one queueing system (e.g., Apache Kafka, RabbitMQ).
  • Experience with Cloud Computing / Distributed Computing / Amazon Web Services / GCP / AZURE
  • Experience with RESTful web services (e.g., Jetty) and the Java Spring framework.
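As a minimal illustration of the MapReduce programming model named in the Hadoop requirements above, here is a pure-Python word-count sketch. This is illustrative only: a real job would run distributed mappers and reducers on YARN over HDFS, with the framework performing the shuffle/sort between the two phases.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)


def reduce_phase(pairs):
    # Shuffle/sort: group pairs by key, as the framework would between phases,
    # then sum the values per key (the reducer).
    counts = {}
    for key, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        counts[key] = sum(v for _, v in group)
    return counts


result = reduce_phase(map_phase(["big data big plans", "data wins"]))
# result == {"big": 2, "data": 2, "plans": 1, "wins": 1}
```

The same mapper/reducer contract carries over to Spark (`flatMap` followed by `reduceByKey`) and to Hadoop Streaming jobs.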

Experience

Overall 14+ years of hands-on experience in Big Data.


Tags: Apex Architecture Avro AWS Azure Big Data Bigtable Cassandra CockroachDB Computer Science Cosmos DB Data Analytics Data governance Data management Data visualization Engineering ETL Flink GCP Hadoop HBase HDFS Informatica Java Kafka Machine Learning MongoDB Neo4j NiFi NoSQL Oozie Oracle Parquet PostgreSQL Power BI Privacy Qlik R RabbitMQ Redshift Research SDLC Security Spark SQL Statistics Streaming Tableau Talend

Region: Asia/Pacific
Country: India
