Data Engineer Lead/Architect
Bengaluru, KA, India
Expleo
Expleo is a trusted partner for end-to-end, integrated engineering, quality services and management consulting for digital transformation.
Overview
Data Engineer Lead/Architect:
Overall 14+ years of hands-on experience in Big Data.
Payment gateway experience.
Responsibilities
- Define and execute the strategic roadmap for Big Data architecture and design, and for overall enterprise data governance and management across the enterprise and data centers.
- Establish standards and guidelines for the design & development, tuning, deployment and maintenance of information, advanced data analytics, ML/DL, data access frameworks and physical data persistence technologies.
- Research advanced data techniques, including data ingestion, data processing, data integration, data access, data visualization, text mining, data discovery, statistical methods, and database design and implementation.
- Experience in developing use cases, functional specs, design specs, ERDs, etc.
- Experience with the design and development of multiple object-oriented systems.
- Experience with extending Free and Open-Source Software (FOSS) or COTS products.
- Architect solutions for key business initiatives ensuring alignment with future state analytics architecture vision.
- Work closely with the project teams as outlined in the SDLC engagement model to provide guidance in implementing solutions at various stages of projects.
- Engage constructively with project teams to support project objectives through the application of sound architectural principles.
- Develop and validate that the proposed solution architecture supports the stated & implied business requirements of the project.
- Review technical team deliverables for compliance with architecture standards and guidelines.
- Adopt innovative architectural approaches to leverage in-house data integration capabilities consistent with architectural goals of the enterprise.
- Create architectural designs for different stakeholders that provide a conceptual definition of the information processing needs for the delivery project.
- Provide right-fit solutions to implement a variety of data and analytics requirements using Big Data technologies in the Insurance/Healthcare space.
- Provide expertise in defining Big Data Technical Architecture and design using Hadoop, NoSQL and Visualization tools/platforms.
- Get down to the programming/code level and provide hands-on expertise in the technical features of Big Data tools/platforms.
- Lead a team of designers/developers and guide them throughout the system implementation life cycle.
- Engage client Architects, Business SMEs and other stakeholders during Architecture, Design and implementation phases.
- Define new data infrastructure platform to capture vehicle data. Research and develop new data management solutions, approaches and techniques for data privacy and data security, and advanced Data Analytics systems.
- Design complex, distributed software modules using Big Data technologies, streaming-data technologies and Java/JEE for real-time stream/event processing and real-time parameterized and full-text search of data.
- Design NoSQL data models that achieve strong data consistency despite the lack of ACID transactions.
- Design and enhance highly scalable, high-performance and fault-tolerant architectures across all tiers of the software, and develop modules based on that architecture.
- Design and enhance an architecture that supports zero-downtime requirements while different components of the system are updated/upgraded.
- Integrate with IoT devices using a data ingestion pipeline that allows application of configurable real-time rules.
- Apply machine learning algorithms to the real-time data for predictive analytics.
- Design software, write code, write unit test cases, test code and review code on a daily basis.
- Lead multiple project modules and development teams simultaneously, ensuring the quality and timely delivery of their deliverables.
- Create and enhance scalable, high-performance and fault-tolerant architectures.
- Use caching, queuing, sharding, concurrency control, etc. to improve performance and scalability.
- Enhance the zero-downtime architecture used during product deployment upgrades.
- Identify performance and scalability bottlenecks and provide solutions to resolve them.
- Perform code reviews, provide feedback and oversee code corrections daily to ensure compliance with the development guidelines.
- Provide technical expertise in the diagnosis and resolution of issues, including the determination and provision of workaround solutions.
- Experience in data modeling and data processing architectures using NoSQL databases to achieve strong data consistency despite the lack of ACID transactions.
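Several of the responsibilities above call for strong data consistency over NoSQL stores that lack ACID transactions. A common pattern for this is optimistic concurrency control with per-record versioning, the idea behind conditional writes in stores such as Cassandra (lightweight transactions) and DynamoDB. The sketch below is purely illustrative: `VersionedStore` is a hypothetical in-memory stand-in for a NoSQL table, not any real client API.

```python
# Illustrative sketch of optimistic concurrency control (compare-and-set).
# VersionedStore is a hypothetical in-memory stand-in for a NoSQL table;
# real stores expose the same idea as conditional/lightweight writes.

class ConflictError(Exception):
    """Raised when a write loses a compare-and-set race."""

class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Missing keys read as version 0 with no value.
        return self._data.get(key, (0, None))

    def write(self, key, value, expected_version):
        # Conditional write: succeeds only if no one updated the row
        # since we read it (analogous to "UPDATE ... IF version = ?").
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise ConflictError(
                f"{key}: expected v{expected_version}, found v{current_version}")
        self._data[key] = (current_version + 1, value)
        return current_version + 1

def apply_delta(store, key, delta):
    # Read-modify-write loop: retry on conflict instead of locking.
    while True:
        version, balance = store.read(key)
        try:
            return store.write(key, (balance or 0) + delta, version)
        except ConflictError:
            continue  # another writer won the race; re-read and retry
```

Single-partition version checks like this trade locking for retries, which is why they scale well on distributed stores where cross-partition transactions are unavailable or expensive.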
Qualifications
Preferably BE/ME/BTech/MTech/MSc in Computer Science or a related engineering field, or MCA.
Should hold relevant industry certifications, such as Big Data Architect.
Must possess analytical, interpersonal, leadership and organizational skills.
Essential skills
- Experience with distributed, scalable Big Data stores or NoSQL databases, including Cassandra, Cosmos DB, Cloudbase, HBase, or Bigtable.
- Experience with Apache Hadoop and the Hadoop Distributed File System (HDFS), the S3 protocol, object stores, and processing large data stores.
- Experience with Hadoop, MapReduce, YARN, ZooKeeper, Pig, Hive, Oozie, Flume, Sqoop, Spark, Storm, HBase, MongoDB and Cassandra.
- Experience with R, Qlik, Tableau, Postgres, Kerberos, Knox and Ranger.
- Experience with J2EE technologies, including web services development and enterprise application integration.
- Experience with message-queue technologies such as RabbitMQ or ActiveMQ is a plus.
- Experience with other NoSQL Databases (CouchDB, Neo4j, InfiniteGraph, JanusGraph, ArangoDB, OrientDB, CockroachDB, Redis, Vitess etc.)
- Real-time stream processing frameworks: Beam, Spark Streaming, Flink, Kafka Streams, Apex, Heron, Storm, etc.
- Experience with queuing systems: Kafka, ActiveMQ, RabbitMQ, etc.
- Experience with ETL design using tools such as Informatica, Talend, Oracle Data Integrator (ODI), Dell Boomi or equivalent.
- Experience with Big Data & Analytics solutions: Hadoop, Pig, Hive, Spark, Spark SQL, Storm, AWS (EMR, Redshift, S3, etc.) or Azure (HDInsight, Data Lake design, analytical services, Power BI).
- Experience in developing code around Hadoop with Oozie, Sqoop, Pig, Hive, HBase, Avro, Parquet, Spark and NiFi.
- Experience with NoSQL/Big Data platforms (e.g., Spark, Cassandra, Hadoop).
- Experience with Lucene-based cluster technologies (e.g., Elasticsearch, Solr).
- Experience with a queueing system (e.g., Apache Kafka, RabbitMQ).
- Experience with cloud/distributed computing: Amazon Web Services, GCP or Azure.
- Experience with RESTful web services (e.g., Jetty) and the Java Spring Framework.
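The stream processing frameworks listed above (Spark Streaming, Flink, Kafka Streams, etc.) all share one core operation: grouping an unbounded event stream into time windows and aggregating per window. As a minimal, framework-free sketch of that idea, here is a tumbling-window count in plain Python; the event shape `(timestamp, key)` and the function name are hypothetical, chosen only for illustration.

```python
# Minimal sketch of tumbling-window aggregation, the core operation behind
# stream frameworks such as Spark Streaming, Flink and Kafka Streams.
# Pure-Python stand-in; event fields here are hypothetical.

from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Group (timestamp, key) events into fixed, non-overlapping windows
    and count occurrences of each key per window."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        # Each event falls into exactly one window, identified by its start.
        window_start = (ts // window_seconds) * window_seconds
        windows[window_start][key] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}
```

For example, with a 5-second window, events at t=1, 3, 7 land in windows starting at 0 and 5. Real frameworks add what this sketch omits: out-of-order (event-time) handling via watermarks, state checkpointing, and distributed execution.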
Experience
Overall 14+ years of hands-on experience in Big Data.