Site Reliability Engineering Foundation

Programme Code D199A
Domain
Cloud Infrastructure
Level
Foundation
Learning Partner(s)
NTUC LearningHub
Duration
2 Days
Format E-learning
Competencies
Development Support, Logging & Metrics, DevOps Methodologies, Ops Excellence
Job Roles
ICT&SS Professional, DevOps Engineer, Cloud Infrastructure Manager, Cloud Infrastructure Architect, Cloud Infrastructure Engineer

Overview

Learn the principles and practices essential for your organisation to scale critical services reliably and economically.

Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The key objectives are to create ultra-scalable and highly reliable distributed software systems.

Introducing a site-reliability dimension requires organisational re-alignment, a new focus on engineering and automation, as well as the adoption of a range of new working paradigms.

Key Takeaways

At the end of this programme, you will be able to:
  • Discover the history of SRE and its emergence at Google
  • Understand the inter-relationship of SRE with DevOps and other popular frameworks
  • Understand the underlying principles behind SRE
  • Understand Service Level Objectives (SLOs) and their user focus
  • Understand Service Level Indicators (SLIs) and the modern monitoring landscape
  • Identify error budgets and the associated error budget policies
  • Understand toil and its effect on an organisation’s productivity
  • Identify some practical steps that can help to eliminate toil
  • Understand observability as an indicator of the health of a service
  • Understand SRE tools, automation techniques and the importance of security
  • Apply anti-fragility, the approach to failure and failure testing
  • Understand the impact that SRE can have on an organisation

Who Should Attend

Please refer to the job roles section.

Prerequisites

  • Prior knowledge of DevOps, which can be achieved by attending: IT14A05 - DevOps Foundation.
  • It is recommended that you have prior working experience or knowledge in IT software development or IT industry operations.

What To Bring

  • Hardware and Software
  • This programme will be conducted as a Virtual Live Class (VLC) via the Zoom platform. You must own a Zoom account and have a laptop or a desktop with “Zoom Client for Meetings” installed. This can be downloaded from https://zoom.us/download.
  • Please ensure that your computer or laptop meets the following requirements:
    • Operating system: Windows 10 or macOS (64-bit or above)
    • Processor/CPU: 1.8 GHz, 2-core Intel Core i3 or higher
    • Minimum 20 GB hard disk space
    • Minimum 8 GB RAM
    • Webcam (the camera must be turned on for the entire duration of the class)
    • Microphone
    • Internet connection: wired or wireless broadband
    • The latest version of the Zoom software must be installed on your computer or laptop before the class
  • A wired internet connection is recommended, as it provides a more stable and reliable connection.
  • Dual monitors are recommended so that you can take part in hands-on exercises while staying engaged with your instructor.

Programme Structure

This programme will cover the following topics:

Module 1: SRE Principles and Practices

  • What is Site Reliability Engineering?
  • SRE and DevOps: What are the Differences?
  • SRE Principles and Practices

Module 2: Service Level Objectives and Error Budgets

  • Service Level Objectives (SLOs)
  • Error Budgets and Error Budget Policies

Module 3: Reducing Toil

  • What is Toil?
  • Why is Toil Bad?
  • Doing Something About Toil

Module 4: Monitoring & Service Level Indicators

  • Service Level Indicators (SLIs)
  • Monitoring and Observability

Module 5: SRE Tools and Automation

  • Automation Focus
  • Hierarchy of Automation Types
  • Secure Automation
  • Automation Tools

Module 6: Anti-Fragility and Learning from Failure

  • Why Learn from Failure?
  • Benefits of Anti-Fragility
  • Shifting the Organisational Balance

Module 7: Organisational Impact of SRE

  • Why Organisations Embrace SRE
  • Patterns for SRE Adoption
  • Sustainable Incident Response
  • Blameless Post-Mortems
  • SRE and Scale

Module 8: SRE, Other Frameworks and Trends

  • SRE and Other Frameworks
  • SRE Evolution
  • Additional Sources of Information


Certificate Obtained and Conferred By:

  • Certificate of Completion from NTUC LearningHub 

Upon meeting at least 75% attendance and passing the assessment(s), you will receive a Certificate of Completion from NTUC LearningHub.

  • Statement of Attainment from SkillsFuture Singapore

Upon meeting at least 75% attendance and passing the assessment(s), you will receive a Statement of Attainment from SkillsFuture Singapore to certify that you have achieved the following Competency Standard(s): Quality Engineering (ICT-DIT-3011-1.1)


External Certification Exam:

After registration, NTUC LearningHub will send you a DevOps exam voucher three days before the programme commences. After completing the programme with at least 75% attendance, you can register and sit for the official “DevOps Site Reliability Engineering Foundation” exam on the DevOps Institute online portal. You must complete the exam within the validity period of the exam voucher.

DevOps Site Reliability Engineering Foundation Exam Details
Number of Questions: 40
Question Format: Multiple-choice
Exam Duration: 60 minutes
Passing Score: 26 out of 40 (65%)

After completing this programme with at least 75% attendance and upon passing the official “DevOps Site Reliability Engineering Foundation” certification exam, you will receive a Certified Site Reliability Engineering Foundation certification from DevOps Institute. The certification is governed and maintained by DevOps Institute.


Full Fee

Full programme fee

S$1400

9% GST on nett programme fee

S$126

Total nett programme fee payable, including GST

S$1526

With effect from 1 Jan 2024

NOTE

Please visit the learning provider’s website to find out more about the programme.

Prices are subject to other NTUC LearningHub miscellaneous fees.

Upcoming Classes

Class 1
09 May 2024 to 10 May 2024 (Full Time)
Duration: 2 days
When: May - 09, 10
Time: 9.00am - 6.00pm
Class 2
20 Jun 2024 to 21 Jun 2024 (Full Time)
Duration: 2 days
When: Jun - 20, 21
Time: 9.00am - 6.00pm
Class 3
19 Aug 2024 to 20 Aug 2024 (Full Time)
Duration: 2 days
When: Aug - 19, 20
Time: 9.00am - 6.00pm
Class 4
04 Nov 2024 to 05 Nov 2024 (Full Time)
Duration: 2 days
When: Nov - 04, 05
Time: 9.00am - 6.00pm

Agency-sponsored

Step 1 Apply through your organisation's training request system.

Step 2 Your organisation's training request system (or relevant HR staff) confirms your organisation's approval for you to take the programme.

Your organisation will send registration information to the academy.

Organisation HR L&D or equivalent staff can click here for details of the registration submission process.


Step 3 GovTech Digital Academy will inform you whether you have been successful in enrolment.