AVP, Data Engineer, Group Consumer Banking and Big Data Analytics Technology, Technology & Operations
Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.
Featuremart is a mart for reusable features that can be used across data science models.
Responsibilities
- Understand the business, functional and technical requirements and translate them effectively, using the Spark framework, into T3 jobsets for specific use cases and/or feature inputs for models.
- Should be able to understand complex transformation logic and translate it into Hive/Spark-SQL queries.
- Hands-on experience with live streaming datasets using Kafka/Spark Streaming.
- Excellent oral and written communication skills, a record of on-time delivery, and the ability to work well in a team.
- Hands-on experience with Spark Core, Spark-SQL, Scala programming and streaming datasets on a Big Data platform
- Should have extensive working experience in Hive and other components of the Hadoop ecosystem (HBase, ZooKeeper, Kafka and Flume)
- Unix shell scripting and setting up cron jobs
- Should have worked with the Cloudera distribution, Oozie workflows (or any scheduler), and Jenkins (or any CI/version control tool)
- Prior experience in the Consumer Banking domain is an advantage
- Prior experience with agile delivery methods is an advantage
- Excellent understanding of technology life cycles and the concepts and practices required to build big data solutions
- Familiarity with data warehouse and model Featuremart concepts
- Core Java skillset is an added advantage
- Ability to understand and build reusable data assets or features that downstream data science models can read and use
- Good knowledge of and experience with any database (Teradata, Data Lake, Oracle or SQL Server) is a plus
- Preferably a master's or bachelor's degree in Computer Science, Engineering, Technology or Data Science.