Logging, Monitoring, and Observability in Google Cloud

(2 days)

 

Course Description

This course teaches participants techniques for monitoring and improving infrastructure and application performance in Google Cloud.

Using a combination of presentations, demos, hands-on labs, and real-world case studies, attendees gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage.

Objectives

  • Explain the purpose and capabilities of Google Cloud’s operations suite.
  • Implement monitoring for multiple cloud projects.
  • Create alerting policies, uptime checks, and alerts.
  • Install and manage the Ops Agent to collect logs for Compute Engine.
  • Explain Cloud Operations for GKE.
  • Analyze VPC Flow Logs and firewall rules logs.
  • Analyze and export Cloud Audit Logs.
  • Profile and identify resource-intensive functions in an application.
  • Analyze resource utilization cost for monitoring-related components within Google Cloud.

Audience

This class is intended for the following participants:

  • Cloud architects, administrators, and SysOps personnel
  • Cloud developers and DevOps personnel

Prerequisites

To get the most out of this course, participants should have:

  • Completed Google Cloud Fundamentals: Core Infrastructure or have equivalent experience
  • Basic scripting or coding familiarity
  • Proficiency with command-line tools and Linux operating system environments

Course Outline

 

Module 1: Introduction to Google Cloud Operations Suite

  • Describe the purpose and capabilities of Google Cloud’s operations suite
  • Explain the purpose of the Cloud Monitoring tool
  • Explain the purpose of Cloud Logging and Error Reporting tools
  • Explain the purpose of Application Performance Management tools

Module 2: Monitoring Critical Systems

  • Use Cloud Monitoring to view metrics for multiple cloud projects
  • Explain the different types of dashboards and charts that can be built
  • Create an uptime check
  • Explain the cloud operations architecture
  • Explain and demonstrate the use of Monitoring Query Language (MQL)

Module 3: Alerting Policies

  • Explain alerting strategies
  • Explain alerting policies
  • Explain error budget
  • Explain why service-level indicators (SLIs), service-level objectives (SLOs), and service-level agreements (SLAs) are important
  • Identify types of alerts and common uses for each
  • Use Cloud Monitoring to manage services

Module 4: Advanced Logging and Analysis

  • Use Logs Explorer features
  • Explain the features and benefits of logs-based metrics
  • Define log sinks (inclusion filters) and exclusion filters
  • Explain how BigQuery can be used to analyze logs
  • Export logs to BigQuery for analysis
  • Use Log Analytics on Google Cloud

Module 5: Working with Audit Logs

  • Explain Cloud Audit Logs
  • List and explain different audit logs
  • Explain the features and functionalities of the different audit logs
  • List the best practices to implement audit logs

Module 6: Configuring Google Cloud Services for Observability

  • Use the Ops Agent with Compute Engine
  • Enable and use Kubernetes Monitoring
  • Explain the benefits of using Google Cloud Managed Service for Prometheus
  • Explain the usage of PromQL to query Cloud Monitoring metrics
  • Explain the uses of OpenTelemetry
  • Explain custom metrics

Module 7: Monitoring Google Cloud Network and Data Access

  • Collect and analyze VPC Flow Logs and firewall rules logs
  • Enable and monitor Packet Mirroring
  • Explain the capabilities of the Network Intelligence Center

Module 8: Investigating Application Performance Issues

  • Explain the features and benefits of Error Reporting, Cloud Trace, and Cloud Profiler
  • Explain the functionality of Error Reporting, Cloud Trace, and Cloud Profiler

Module 9: Optimizing Costs for the Operations Suite

  • Analyze resource utilization cost for monitoring-related components within Google Cloud
  • Implement best practices for controlling the cost of monitoring within Google Cloud