VP, Lead SRE Engineer (Customer Communication Platform), Group Consumer Banking and Big Data Analytics Technology, Technology & Operations
Business Function

Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.

Responsibilities

Site Reliability Engineering (SRE) at DBS combines software and systems engineering to build, run, and maintain high-performing, distributed, fault-tolerant and resilient financial systems. Site Reliability Engineers focus on ensuring that our customers and colleagues experience the best of DBS systems. As a Site Reliability Engineer you will fill a mission-critical role, ensuring that our systems are healthy, monitored, automated, fault-tolerant and designed to scale. You will work closely with engineering teams to continually improve our production services, facilitating fast delivery of new products and reducing downtime. Site Reliability Engineers use automation, continuous monitoring, tools and solid engineering principles to optimize existing systems, build infrastructure and eliminate operational work. DBS Technology and Operations is looking for passionate, creative and detail-oriented engineers who excel at solving operational problems and improving efficiency.
Proactively manage our production services by measuring and monitoring availability, latency, throughput, user journeys and overall system health.
Support services before they go live through activities such as system design inputs, developing software platforms and frameworks, capacity planning and launch reviews.
Engage with product engineering teams to test against the relevant Chaos Engineering toolkit.
Define SLIs, SLOs and error budgets for the system(s).
Define and practice sustainable incident management in a blameless postmortem culture.
Identify and build automated tools for monitoring and reporting.
Assist development teams with the planning and execution of new software releases.
Work with teams located across various geographies in Asia Pacific.

Requirements
Bachelor's or Master's degree in Computer Science, a related technical field that involves programming, or equivalent practical experience.
Minimum of 10 years' technology experience.
3+ years' hands-on SRE experience is required.
2+ years' people management experience with a team of more than 5 members.
3+ years' working experience managing a production support/SRE team.
2+ years' experience with incident management tasks related to root cause analysis, problem management and blameless post-mortems.
Hands-on experience with cloud automation deployment processes, e.g. Kubernetes, OpenShift, AWS.
Experience in one of the development languages: Java or Python.
Good knowledge of networking and architecture design.
Good knowledge of database design.
Very good analytical and problem-solving skills, with a good understanding of the technical risks emerging from architecture decisions.
Good knowledge of DevOps processes and tools, e.g. Jenkins, TeamCity, Jira.
Must have effective communication skills and a sense of ownership and drive.
Passionate about solving operational problems and constant improvement.
Highly motivated, proactive and capable of working under pressure without compromising processes and productivity.
Strong, committed and reliable team player and strong communicator, able to take direction but also willing to contribute to discussions on design and strategy.
Client-facing skills to deal with and form good relationships with the business and other technology groups through day-to-day support and project work.
Interest in financial technologies and new technology tools, and the ability to learn.
We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.