Hands-on programming skills in Scala, Python, and shell scripting.
Exposure to RDBMS and NoSQL databases such as HBase.
Hands-on experience with version control systems such as Git and Bitbucket.
Sound knowledge of Spark query tuning and performance optimization.
Experience working with Cloudera and HDFS.
Deep understanding of distributed systems (e.g. CAP theorem, partitioning, replication, consistency).
Knowledge of real-time tools such as Kafka is an advantage.
Knowledge of cloud platforms such as Google Cloud Platform, AWS, or Azure is an advantage.
Awareness of Agile methodology and tools such as JIRA is an added advantage.
Leadership: drive outcomes with upstream teams, engage with peer engineers, and lead them to solutions that enable connectivity, whether through code or system changes on their side or simply new functional accounts.
Design and build analytics projects that include tasks such as data ingestion, data transformation, data quality, data engineering, and data governance in Big Data.
Design, architect, and implement complex information systems using various Big Data components, batch (ETL/ELT) and streaming APIs, and complex data model design.
Design and create Scala/Spark jobs for data transformation and aggregation.
Process large volumes of structured and semi-structured data and ingest it into the data lake.
Take responsibility for the full SDLC, including analysis, design, development, implementation, support, and enhancement.
Understand banking remediation issues and provide solutions to handle them in Big Data.
Work on defect fixes, enhancements, and production support for different applications.
Develop SQL queries to perform data analysis and data validation.
Develop migration scripts to deploy code from one environment to another.
Provide technical support to business system analysts and customers to help them resolve business incidents/tickets.
Research new tools, technologies, and practices for improving system efficiency.