Google Cloud Big Data and Machine Learning Fundamentals

(1 day)


This course will introduce you to Google Cloud’s big data and machine learning functions. You’ll begin with a quick overview of Google Cloud and then dive deeper into its data processing capabilities.

Course Objectives

  • Identify the purpose and value of the key Big Data and Machine Learning products in Google Cloud.
  • Use Cloud SQL and Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud.
  • Employ BigQuery and Cloud SQL to carry out interactive data analysis.
  • Choose between different data processing products on Google Cloud.
  • Create ML models with BigQuery ML, ML APIs, and AutoML.


Audience

  • Data analysts, data scientists, and business analysts who are getting started with Google Cloud
  • Individuals responsible for designing pipelines and architectures for data processing, creating and maintaining machine learning and statistical models, querying datasets, visualizing query results, and creating reports
  • Executives and IT decision makers evaluating Google Cloud for use by data scientists


Prerequisites

Roughly one year of experience with one or more of the following:

  • A common query language such as SQL
  • Extract, transform, and load activities
  • Data modeling
  • Machine learning and/or statistics
  • Programming in Python

Course Outline

The course includes presentations, demonstrations, and hands-on labs.

Module 1: Introduction to Google Cloud

  • Identify the different aspects of Google Cloud’s infrastructure
  • Identify the big data and ML products that are part of Google Cloud

Module 2: Product Recommendations Using Cloud SQL and Spark

  • Review how businesses use recommendation models
  • Evaluate how and where you will compute and store your housing rental model results
  • Analyze how running Hadoop in the cloud with Dataproc can enable scale
  • Evaluate different approaches for storing recommendation data off-cluster

Module 3: Predicting Visitor Purchases Using BigQuery ML

  • Analyze big data at scale with BigQuery
  • Learn how BigQuery processes queries and stores data at scale
  • Walk through key ML terms: features, labels, training data
  • Evaluate the different types of models for structured datasets
  • Create custom ML models with BigQuery ML

Module 4: Real-time Dashboards with Pub/Sub, Dataflow, and Google Data Studio

  • Identify modern data pipeline challenges and how to solve them at scale with Dataflow
  • Design streaming pipelines with Apache Beam
  • Build collaborative real-time dashboards with Data Studio

Module 5: Deriving Insights from Unstructured Data Using Machine Learning

  • Evaluate how businesses use ML models for unstructured data and how those models work
  • Choose between pre-built and custom machine learning models
  • Create a high-performing custom image classification model with no code using AutoML

Module 6: Summary

  • Recap of key learning points
  • Resources