Course 793B:
Programming Microservice and Big Data Applications in the Cloud

(3 days)


Course Overview

This course teaches students how to integrate source control, automated testing, versioning, and automated deployment in an Agile, cloud-based environment. It covers microservice design and architecture, as well as integrating cloud services into projects. It also covers using the cloud to analyze big data and adding big data analysis and machine learning to applications.

Learning Objectives

  • Develop a Continuous Deployment Pipeline on AWS and GCP
  • Integrate Unit Tests into cloud-based automated builds
  • Architect Microservices and Program REST Services in AWS and GCP
  • Deploy Hadoop and Spark Clusters on AWS and GCP
  • Customize Amazon Elastic MapReduce (EMR) and Google Cloud Dataproc computing clusters to run PySpark and Jupyter notebooks
  • Leverage AWS Athena and GCP BigQuery for massively scalable, serverless Big Data analysis
  • Add Machine Learning capabilities like Vision, Speech, and Text processing to your applications
  • Develop Machine Learning models with Amazon Machine Learning and Google Cloud ML

Who Should Attend

This course is intended for developers, analysts, programmers, and architects who are working on cloud-based software projects or migrating existing applications to the cloud. Anyone who wants a hands-on introduction to using cloud services for big data analysis and machine learning will also benefit from this course.

Students can choose to do the exercises using either Amazon Web Services or Google Cloud Platform.


Prerequisites

An understanding of cloud computing and SaaS development to the level of Course 793A: Deploying Applications on Amazon Web Services and Google Cloud Platform is assumed. Some programming and database development experience is also helpful, as is a basic understanding of big data.

Students in this course can do exercises using Amazon Web Services, Google Cloud Platform, or both. To do this, they will need:

  • AWS and/or GCP accounts with rights to create cloud resources. Students can use a personal account; a company-provided account must have administrative rights.
  • Unrestricted internet access
  • A computer with the latest version of Google Chrome
  • The AWS Command Line Interface and/or the Google Cloud SDK installed on their workstation
  • Development tools for their preferred language (Eclipse, IntelliJ, PyCharm, etc.) installed on their workstation

A pre-course exercise will be provided for students to configure their machines.

Course Outline

  • Continuous Integration
    • Source Control
    • Git
    • Exercise: Using Git in the Cloud
    • Package Managers
    • Build Server
    • Test Integration
    • Automated Deployment
    • Versioning
    • A/B Testing
    • Exercise: Continuous Integration

  • Test-Driven Development
    • Conditions of Satisfaction
    • Defining Acceptance Criteria
    • Given…When…Then Format
    • Activity: Defining Conditions of Satisfaction and Acceptance Criteria
    • Testing Overview
    • Test Automation
    • Unit Testing
    • Mocking
    • Activity: Writing Unit Tests
    • Integration Testing
    • End-To-End Testing
    • Activity: End-To-End Testing
    • Code Quality Principles
    • Code Smells
    • Clean Code Principles
    • Activity: Review of Code Samples

  • Building Microservices
    • Service-Oriented Architecture
    • Microservice Architecture
    • Activity: Designing Microservices
    • REST
    • Microservice Boundaries
    • Domain-Driven Design
    • Versioning
    • Security
    • Organizing Microservices
    • Testing Services
    • Mocking Services
    • Exercise: Programming Microservices in the Cloud

  • Cloud APIs
    • Enabling APIs
    • API Examples
    • Discovery
    • Machine Learning APIs
    • Exercise: Integrating Cloud APIs into your Applications

  • Cloud Computing Big Data Services
    • Hadoop and Spark
    • HDFS
    • MapReduce
    • Pig
    • Hive
    • PySpark
    • Exercise: Creating Hadoop and Spark Clusters
    • IPython Notebooks
    • Jupyter
    • Exercise: Programming IPython Notebooks
    • ETL
    • Pipelines
    • Inputs and Outputs
    • Transformations
    • Big Data Analysis
    • Exercise: Leveraging the Cloud for Big Data Analysis

  • Real-Time Data Analysis
    • Introduction to Publish/Subscribe Messaging
    • Topics
    • Publishers
    • Subscribers (Push and Pull)
    • AWS Kinesis

  • Machine Learning
    • Machine Learning Overview
    • Data Collection
    • Regression
    • Classification
    • Neural Networks
    • Machine Learning Frameworks
    • Exercise: Training Machine Learning Models
    • Leveraging Existing Machine Learning APIs
    • Vision
    • Speech
    • Translation
    • Exercise: Integrating Machine Learning APIs into Applications

  • Course Summary
    • Homework

Please Contact Your ROI Representative to Discuss Course Tailoring!