Course 975:
SRE Essentials

(2 days)


Course Description

Site reliability engineering (SRE) is a software engineering approach to IT infrastructure and operations that aligns incentives between development and operations and includes mission-critical production support. This workshop starts with an introduction to the main practices of SRE and the role IT and business leaders play in successful SRE adoption. The course then shows participants how Service Level Indicators (SLIs) and Service Level Objectives (SLOs) should be used to measure a service’s reliability. Attendees will gain hands-on experience creating these measures in practice. These concepts help create a culture where the reliability and success of a service can be objectively measured.

Learning Objectives

  • Articulate the technical and cultural fundamentals of SRE and understand the value they can provide to your IT operations in any environment
  • Learn SRE terminology
  • Understand why services need SLOs
  • Achieve developer and operations harmony with error budgets
  • Choose appropriate SLIs based on user journeys
  • Create specific, measurable, achievable, relevant, and time-bound SLOs

Who Should Attend

This workshop is aimed at development and operations engineers and technical managers, but it can also be useful for product and business leaders who want to learn more about SRE.

Course Outline


Unit 1: SRE Cultural Practices

  • Blameless Postmortems
  • Collaboration and Communication
  • Knowledge Sharing
  • Reducing Organizational Silos
  • Accepting Failure as Normal
  • Service Level Objectives (SLOs)
  • Error Budgets
  • Make Tomorrow Better Than Today
  • Regulate Workload
  • Apply SRE in Your Organization

Unit 2: The Art of SLOs

  • SRE Terminology
    • SLI, SLO, SLA, User
  • Why Your Services Need SLOs
    • It’s All About Reliability and Building User Trust
  • Incentivizing Reliability Across DevOps Teams with Solid SRE Practices
  • Choosing a Good SLI
    • Mapping SLIs to User Journeys
    • The Classic SLI Menu
  • Relating SLIs to Customer Pain
  • Finding SLIs with a Good Signal-to-Noise Ratio
  • Defining Specific, Measurable, Achievable, Relevant, and Time-Bound SLO Targets
  • Calculating and Leveraging Error Budgets
  • Working with Postmortems
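
As an illustration of the SLI, SLO, and error budget topics in Unit 2, the arithmetic can be sketched as follows. This is a minimal sketch, not course material: the function names, the request-based SLI, and the sample numbers are all assumptions chosen for the example.

```python
def sli(good_events: int, total_events: int) -> float:
    """Request-based SLI: the fraction of events that were good."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget(slo_target: float, total_events: int) -> float:
    """Number of bad events the SLO tolerates over the measurement window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget(slo_target, total_events)
    return (budget - bad_events) / budget


# Hypothetical window: 1,000,000 requests, 250 failures, 99.9% SLO target.
print(f"SLI: {sli(999_750, 1_000_000):.4%}")
print(f"Budget (bad events allowed): {error_budget(0.999, 1_000_000):.0f}")
print(f"Budget remaining: {budget_remaining(0.999, 1_000_000, 250):.0%}")
```

In this hypothetical window the service has spent a quarter of its budget, which is the kind of figure teams can use to decide whether to prioritize releases or reliability work.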

Please Contact Your ROI Representative to Discuss Course Tailoring!