Google BigQuery for Data Analysts

(3 days) 

 

This 3-day instructor-led class introduces participants to Google BigQuery. Through a combination of instructor-led presentations, demonstrations, and hands-on labs, students learn how to store, transform, analyze, and visualize data using Google BigQuery.

Objectives

At the end of this course, participants will be able to:

  • Understand the purpose of and use cases for Google BigQuery
  • Describe ways in which customers have used Google BigQuery to improve their businesses
  • Understand the architecture of BigQuery and how queries are processed
  • Interact with BigQuery using the web UI and command-line interface
  • Identify the purpose and structure of BigQuery schemas and data types
  • Understand the purpose of and advantages of BigQuery destinations tables and caching
  • Use BigQuery jobs
  • Transform and load data into BigQuery
  • Export data from BigQuery
  • Store query results in a destination table
  • Create a federated query
  • Export log data to BigQuery and query it
  • Understand the BigQuery pricing structure and evaluate mechanisms for controlling query and storage costs
  • Identify best practices for optimizing query performance
  • Troubleshoot common errors in BigQuery
  • Use various BigQuery functions
  • Use external tools such as spreadsheets to interact with BigQuery
  • Visualize BigQuery data
  • Use access controls to restrict access to BigQuery data
  • Query Google Analytics Premium data exported to BigQuery

Audience

This class is intended for data analysts and data scientists responsible for: analyzing and visualizing big data, implementing cloud-based big data solutions, deploying or migrating big data applications to the public cloud, implementing and maintaining large-scale data storage environments, and transforming/processing big data.

Prerequisites

Before attending this course, you should have:

  • Attended CP100A – Google Cloud Platform Fundamentals OR CPB100 – Google Cloud Platform Big Data & Machine Learning Fundamentals (or equivalent experience)
  • Experience using a SQL-like query language to analyze data

Course Outline

Module 1: Introducing Google BigQuery

  • Understand the purpose of and use cases for Google BigQuery
  • Describe ways in which customers have used Google BigQuery to improve their businesses
  • Lab: Sign Up for the Free Trial and Create a Project

Module 2: BigQuery Functional Overview

  • Describe the components of a BigQuery project
  • Identify how BigQuery stores data and list the advantages of the storage model
  • Understand the architecture of BigQuery and how queries are processed
  • Describe the methods of interacting with BigQuery
  • Lab: Explore BigQuery Interfaces

Module 3: BigQuery Fundamentals

  • Describe the purpose of denormalizing data
  • Identify the purpose and structure of BigQuery schemas and data types
  • Explain the types of actions available in BigQuery jobs
  • Understand the purpose of and advantages of BigQuery destinations tables and caching

Lab: BigQuery Components and Jobs

  • Explore how data is organized in BigQuery
  • Learn about the two types of table schemas
  • Learn about jobs, and how to cancel them
  • Investigate caching and destination tables

Module 4: Ingesting, Transforming, and Storing Data

  • Describe the methods for ingesting data, transforming data, and storing data using BigQuery
  • Explain the function of BigQuery federated queries

Lab 4, Part I: Loading Data into BigQuery and Using Federated Queries

  • Load a CSV file into a BigQuery table using the web UI
  • Load a JSON file into a BigQuery table using the CLI
  • Transform data and join tables using the web UI
  • Store query results in a destination table
  • Query a destination table using the web UI to confirm your data was transformed and loaded correctly
  • Export query results from a destination table to Google Cloud Storage
  • Create a federated query that queries data in Cloud Storage

Lab 4, Part II: Exporting App Engine Logs to BigQuery

  • Set up Google Cloud Logging to export App Engine log data from the Guestbook application
  • Use the BigQuery web UI to query the log data

Module 5: Pricing and Quotas

  • Explain the advantages of the BigQuery pricing model
  • Use the pricing calculator to calculate storage and query costs
  • Identify the quotas that apply to BigQuery projects

Lab: BigQuery Pricing

  • Evaluate the size of a query within BigQuery using the BigQuery web UI
  • Use the Pricing Calculator and the total size of the query to estimate the query cost
  • Examine how changing a query affects query cost

Module 6: Clauses and Functions

  • Explain the differences between BigQuery SQL and ANSI SQL
  • Identify the purpose of and use cases for user-defined functions
  • Explain the purpose of various BigQuery functions
  • Lab: BigQuery Clauses and Functions

Module 7: Nested and Repeated Fields

  • Identify the purpose and structure of BigQuery nested, repeated, and nested repeated fields
  • Describe the use cases for nested, repeated, and nested repeated fields
  • Lab: Nested Fields
  • Lab: Repeated Fields
  • Lab: Nested Repeated Fields

 

Module 8: Query Performance

  • Explain the impact of the following in query performance: JOIN and GROUP BY, table wildcards, and table decorators
  • Identify various best practices for optimizing query performance
  • Lab: BigQuery Best Practices and Optimization Techniques

Module 9: Troubleshooting Errors

  • Describe how to handle the most common BigQuery errors: request encoding errors, resource errors, and HTTP errors
  • Lab: Handling Errors

Module 10: Access Control

  • Describe the purpose of access control lists in BigQuery
  • List and explain the project and dataset roles available in BigQuery
  • Apply views for row-level security
  • Lab: Access Control

Module 11: Exporting Data

  • List the methods of exporting data from BigQuery and the data formats available
  • Describe the process of creating a job to export data from BigQuery
  • Explain the purpose of wildcard exports to partition export data
  • Lab: Exporting Data

Module 12: Interfacing with External Tools

  • Describe how to use external tools to interface with BigQuery, including: spreadsheets, ODBC and JDBC drivers, the BigQuery encrypted client, and R
  • Lab: Interfacing with External Tools

Module 13: Working with Google Analytics Premium Data

  • Describe the schema of the Google Analytics Premium and AdSense data exported to BigQuery
  • Lab: Working with Google Analytics Premium Data

Module 14: Data Visualization

  • Describe the options available for visualizing BigQuery data
  • Lab: Visualizing Data