SVP / VP, Site Reliability Lead, Middle Office Technology - Central Function, Technology and Operations
Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.
The Role
Site Reliability Engineering (SRE) at DBS combines software and systems engineering to build, run and maintain high-performing, distributed, fault-tolerant and resilient financial systems. Site Reliability Engineers focus on ensuring our customers and colleagues experience the best of DBS systems.
As Site Reliability Lead, you will lead a team of Site Reliability Engineers to fulfil the mission-critical role of ensuring that our systems are healthy, monitored, automated, fault tolerant and designed to scale. You will work closely with Engineering teams to continually improve our production services, facilitating fast delivery of new products and reducing downtime. You will also drive innovation in the SRE space, experimenting with and utilizing state-of-the-art technologies like AI/ML to push the envelope of what's possible in terms of performance, resiliency, efficiency and quality. Because Site Reliability Engineering is, at its heart, as much about culture and mindset change as tooling and technology, you must be a seasoned Technology Lead who feels perfectly at home leading by example and getting your hands dirty by coding actively alongside your team. If you love to learn, are excited about working at the leading edge and are not afraid to fail, we want you.
Responsibilities
Lead the program to build and operationalize next-generation monitoring and management tools utilizing principles of observability, AI/ML and deep analytics.
Set up and develop capabilities and practices in the organisation to operate, govern and migrate applications to modern cloud infrastructures and stacks.
Oversee enterprise-wide programs within the bank on improving infrastructure readiness and technology risks, strategic technology modernization and adoption, and reduction of technical debt.
Oversee programs on strategic cost optimization and optimization of operational efficiencies.
Partner with enterprise functions to drive the strategy and adoption for chaos engineering activities, including development of playbooks, new tools and approaches to automate, measure and quantify effectiveness and coverage.
Assist with SRE certification and build teams to define and operationalize SLIs, SLOs and error budgets for the systems.
Lead and drive improvements to incident management processes and practices, including adoption of automation and new tools to shorten MTTD & MTTR, and reimagining the role of a modern command centre.
Develop, embody and promulgate the practice of blameless post-mortem culture with focus on open sharing, psychological safety and institutionalizing new learnings.
Requirements
- Bachelor's or Master's degree in Computer Science, a related technical field that involves programming, or equivalent practical experience.
- Minimum of 10 years of technology experience.
- Prior SRE experience is highly regarded.
- Development experience and solid developer skills in one or more modern programming languages (e.g. Java, Ruby, JavaScript, Python, Golang, etc.).
- Experience with testing, especially non-functional testing, and a good understanding of automated/regression testing.
- Experience with developing applications in a Linux environment, with sound knowledge of algorithms, data structures, complexity analysis and software design.
- Experience handling incidents and tasks related to root cause analysis, problem management and blameless post-mortems.
- Very good analytical and problem-solving skills, with a deep understanding of technical risks arising from architecture decisions.
- Understanding of complex architectures and well versed in design patterns.
- Systematic problem-solving approach coupled with effective communication skills and a sense of ownership and drive.
- Ability to debug and optimize code and to automate routine tasks.
- Passionate about solving operational problems and continuous improvement.
- Highly motivated, proactive and capable of working under pressure without compromising development processes and productivity.
- Strong, committed and reliable team player and strong communicator, able to take direction but also willing to contribute to discussions on design and strategy.
- Client-facing skills to deal with and form good relationships with the business and other technology groups through day-to-day support and project work.
- Interest in financial technologies, new technology tools and the ability to learn quickly.
- A team player with excellent communication and interpersonal skills.
We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.