Principal SRE Resilience Engineer

in Toronto, ON, Canada
Permanent, Full time
Requisition ID: 75930

Join the Global Community of Scotiabankers to help customers become better off.
As Scotiabank's engine of modernization, the PLATO platform enables technology teams to build software quickly and securely using modern practices. PLATO is an integrated set of technical capabilities, services and processes that encapsulate critical enterprise functions through standardization, re-use and automation.
The PLATO team comprises engineers, problem solvers, agilists and creatives in roles such as Enterprise Platform Engineering and Architecture, Enterprise Data Services, Cloud Infrastructure and Architecture, Product Engineering, and Product Management. Together, the team provides the platform that enables the Bank to deliver transformative experiences that help our 24 million customers become better off.
Interested in joining an agile team that's impacting change for our customers around the world? Watch our video
Job Purpose:
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that all BNS services, both our internally critical and externally visible systems, have reliability and uptime appropriate to client needs and are continuously improved, while keeping a watchful eye on capacity and performance. The ideal candidate for the Site Reliability Engineer position typically self-identifies as a "hacker": a "jack of all trades" who also possesses deep knowledge in multiple areas of software development, Linux/Unix systems administration, networking, internet protocols, databases, and distributed systems. The ideal candidate has a mix of software development and infrastructure operations skills, and approaches infrastructure operations from the perspective of a software engineer. Relevant industry experience is important, but ultimately less so than your demonstrated abilities and attitude. You will work with talented and passionate engineers on an engineering team with a strong technology background, as part of a small team of SREs supporting a medium-sized team of software engineers who are building the next generation of cloud environments, platforms and digital products.
Key Responsibilities
    • Set standards & provide requirements for engineering teams to deliver ops-ready software
    • Understand hardware systems, including scaling and benchmarking hardware platforms
    • Experience with DevOps and build/release pipelines
    • Set technology / technical standards for development t
    • Hands-on with Ansible, Chef, Puppet, Ruby, Terraform, PowerShell/Bash/Python, Docker and Kubernetes
    • Experience with InfoSec certifications and remediation, and with patch distribution
    • Experience with Enterprise Data Warehouse and Data technologies
    • Experience with collection, parsing and analyzing data related to system performance
    • Enable a self-service model of execution and actively decentralize the decision-making process
    • Experience with 24/7 site monitoring and ownership of uptime and performance SLAs
    • Real-world experience with disaster recovery protocols and processes
    • Operational triage, problem identification, handling requests/incidents queue
    • Scripting, automating tasks, OS patching, migrations and security
    • Support highly available and resilient systems, providing proactive disaster recovery
    • Develop deployment and automation tools to manage a growing number of services.
    • Manage server/storage deployments and implement upgrades.
    • Install, configure, monitor, maintain, troubleshoot, and back up systems to ensure functionality, data consistency, security, and usability.
    • Identify potential issues in system performance and implement solutions.

Functional Competencies:
  • Strong driver of new and emerging technologies
  • Proven leadership in building and delivering software platforms, as well as execution of major projects
  • Excellent skills in relationship and general business management
  • Highly skilled and knowledgeable in complex technology environments
  • Strong IT technology experience in a high-volume, complex, customer-oriented IT environment
  • Deep product deployment and automation experience in Google Kubernetes and Docker environments
  • Hands-on experience with a wide range of Google and Microsoft services, including BigQuery and Azure
  • 2+ years as a Site Reliability Engineer or DevOps Engineer
  • Strong fundamentals of security, network isolation, firewalls
  • Has been responsible for uptime, upgrades, reliability, and operations of a SaaS platform
  • Built cloud ops teams from the ground up
Preferred Skills
    • Experience managing conversion of on-prem systems to Cloud solutions
    • Scaled systems to 1000s of servers in deployment
    • Experience in automation enablement for CI/CD
    • Experience with algorithms, data structures, complexity analysis and software design
    • Experience in one or more mainstream programming languages
    • Interest in designing, analyzing and troubleshooting large-scale distributed systems.
    • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
    • Ability to debug and optimize code and automate routine tasks.
Location(s): Canada : Ontario : Toronto
As Canada's International Bank, we are a diverse and global team. We speak more than 100 languages with backgrounds from more than 120 countries. Our employees are committed to a superior customer experience and use the Bank's six guiding sales practice principles to ensure they act with honesty and integrity.

At Scotiabank, we value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. If you require accommodation (including, but not limited to, an accessible interview site, alternate format documents, ASL Interpreter, or Assistive Technology) during the recruitment and selection process, please let our Recruitment team know. Candidates must apply directly online to be considered for this role. We thank all applicants for their interest in a career at Scotiabank; however, only those candidates who are selected for an interview will be contacted.
