Cloud Infrastructure Engineer
Cloud Infrastructure Architect
Cloud Infrastructure Manager
DevOps Engineer
ICT&SS Professional
Overview
Enhance your team's capabilities with the Reliability Engineering Coaching programme. Improve performance, scalability, and reliability for smooth, efficient operations.
Develop the skills needed to maintain robust systems through personalised coaching and a hands-on, practical approach. Foster continuous improvement, drive innovation, and achieve operational excellence with expert guidance.
Key Takeaways
At the end of this programme, you will be able to:
implement a successful reliability culture in your organisation
understand reliability principles and recognise anti-patterns to avoid them
assess the organisational impact of introducing reliability
improve SLIs and SLOs in a distributed ecosystem, and extend Error Budgets to innovate and mitigate risks
build security and resilience by design in a zero-trust environment
implement full stack observability, distributed tracing, and foster an Observability-driven development culture
curate data using AI for proactive and predictive incident management, and use DataOps for clean data lineage
recognise the importance of Platform Engineering for consistency and reliability
apply practical Chaos Engineering techniques
manage major incident response using an incident command framework and understand unmanaged incidents
appreciate why Reliability Engineering is a pure implementation of DevOps
execute the Reliability Engineering model effectively
understand that reliability is everyone's responsibility
learn from success stories in Reliability Engineering
Who Should Attend
Please refer to the job roles section.
Public Officers interested in modern IT leadership and organisational change approaches and looking to enhance large-scale service scalability and reliability.
Prerequisites
You should have completed the SRE Foundation Programme.
What To Bring
A laptop with a reliable internet connection and the Zoom application installed.
This programme will cover the following topics:
Module 1: Anti-Patterns
Rebranding Ops as Reliability Engineering
Users notice an issue before you do
Measuring until my Edge
False positives are worse than no alerts
Configuration management trap for snowflakes
The Dogpile: Mob incident response
Point fixing
Production Readiness Gatekeeper
Fail-Safe really?
Use Case Discussion
Module 2: SLO is a Proxy for Customer Happiness
Define SLIs that meaningfully measure the reliability of a service from a user’s perspective
Choose appropriate SLO targets, including how to perform statistical and probabilistic analysis
Use error budgets to help your team have better discussions and make better data-driven decisions
Use Case Discussion
Module 3: Building Secure, Scalable and Reliable Systems
Reliability Engineering and its role in Building Secure and Reliable systems
Design for Changing Architecture
Fault-Tolerant Design
Design for Security
Design for Resiliency
Design for Reliability
Use Case Discussion
Module 4: Full-Stack Observability
Modern applications are Complex & Unpredictable
Slow is the new down
Pillars of Observability
Using OpenTelemetry
Use Case Discussion
Module 5: Platform Engineering and AIOps
Taking a Platform Centric View
AIOps: A big data view to go from reactive to proactive to predictive management
Technology becomes more human through ML, allowing ubiquitous self-service
Use Case Discussion
Module 6: Incident Response Management
Key responsibilities towards incident response
DevOps & ITIL
OODA and Reliability Incident Response
Closed Loop Remediation and the Advantages
Swarming – Food for Thought
Use Case Discussion
Module 7: DiRT and Chaos Engineering
Disaster Recovery Testing
Fault Injection
Chaos Engineering
Tools that can be instrumented for Chaos Engineering
Use Case Discussion
Module 8: Reliability is the Purest form of DevOps
Key Principles of Reliability Engineering
How to increase Reliability across the spectrum
Metrics for Success
Possible Implementation Model
Cultural and Behavioural Skills are key
Case Study
Use Case Discussion
Full Fee
Full programme fee: S$1515
9% GST on nett programme fee: S$136.35
Total nett programme fee payable, including GST: S$1651.35
With effect from 1 Jan 2024
NOTE
Funding is available for this programme. Please visit the Learning Partner’s website for the updated programme fee funding breakdown, eligibility criteria, and terms and conditions.