Jupyter Notebook and Plotting
by Arthur Messenger

I have been using Jupyter Notebook instead of PowerPoint slides for some of the advanced courses ROI offers. Notebooks have a lot of features that make teaching much more interactive and productive.

As with any large package, there are things I would like to do that are not instantly available and a workaround needs to be created. This is a short blog of one such workaround.

The original Introduction to Machine Learning course had some 40 to 50 Python scripts showing how the different models work. (The current version has been divided into two different courses, each with about 40 Python scripts.) All of the scripts were designed and tested using scikit-learn.org tools with Anaconda 3’s Python, both on the bash command line and in Python 3 notebooks. We found only one glitch in moving to %run cells inside of a Jupyter Notebook.

The program in Figure 1 has been modified for this blog post to show only the glitch being addressed.

This script displays the graphic in a separate window and writes the prompt “Press enter key to exit!” to the command line. When showing multiple graphics, “exit” was replaced with “continue.”
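Figure 1 is not reproduced in this post, but the pattern it shows looks something like the sketch below. This is a stand-in, not the course’s actual graphic.py: it assumes matplotlib and plots a sine wave where the real scripts plotted model results.

```python
import matplotlib.pyplot as plt
import numpy as np

def show_and_wait():
    """Open the plot in a separate window, then wait for the enter key."""
    x = np.linspace(0.0, 2.0 * np.pi, 200)
    plt.plot(x, np.sin(x))              # stand-in for the course's model plots
    plt.show(block=False)               # display the graphic in its own window
    input("Press enter key to exit!")   # hold the window open until enter is tapped
    plt.close("all")

if __name__ == "__main__":
    show_and_wait()
```

Without the input() call, the script would reach the end and close the window immediately, which is why the prompt is there at all.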

When the code is moved to a Jupyter Notebook, the code works the same. See Figure 2: graphic.py Execute in Jupyter Notebook.

Jupyter displays the prompt, “Press enter key to exit!” followed by an input box.

The green arrow points to the * which indicates the program has not terminated. This is expected, as the program is waiting for the enter key to be tapped. All is more or less fine.

However, the question is, “Why the ‘Press enter key to exit!’ prompt?” It is confusing: does the script exit, or does Jupyter exit the notebook? The prompt is not needed here, so let’s get rid of it.

Remove line 36, the input line, and the program terminates and Jupyter moves on to the next cell. Not so fine: I now have two different programs, one for the bash command line and one for the cell environment in Jupyter.

We need some way to tell which environment we are in. There are two solutions for this; both are a little off the normal path.

The first one is to use a small function to check whether sys.stdout is attached to a tty (a terminal interface). See Figure 3: is_command_line().

The try/except block is needed because Jupyter Notebook does not implement .fileno().
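Figure 3 is not shown here; a minimal sketch of the check (the exact code in the figure may differ):

```python
import io
import os
import sys

def is_command_line():
    """Return True when sys.stdout is attached to a tty (a real terminal)."""
    try:
        return os.isatty(sys.stdout.fileno())
    except (AttributeError, OSError, io.UnsupportedOperation):
        # Jupyter's stdout replacement does not implement .fileno(), so the
        # call above raises and we know we are not on a plain terminal
        return False
```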

The second one is to see if the program is running interactively under ipython. See Figure 4: is_ipython().

This one depends on get_ipython().config being false when the script is not running under ipython. The try/except block is needed because get_ipython is not available when running plain python.
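Figure 4 is likewise not reproduced; a minimal sketch of the idea:

```python
def is_ipython():
    """Return True when the script is running interactively under ipython."""
    try:
        # IPython injects get_ipython() into the namespace; outside an
        # ipython/notebook session its .config is empty, i.e. falsy
        return bool(get_ipython().config)
    except NameError:
        # plain python: get_ipython does not exist at all
        return False
```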

Both ways work, and they are equally “hacky”. They should only be used when the script’s behavior needs to change with its environment. Which one to use will depend on the aims of the script.



3 Reasons to Choose Google Cloud Platform
by Doug Rehnstrom

While teaching Google Cloud Onboard events in DC and New York over the last couple of weeks, I was coincidentally asked the same question at both events: “Give me a few technical reasons why I would choose Google over other cloud providers.”

1. The Network

When you create a network in Google Cloud Platform, at first it looks like all the other cloud providers’. A network is a collection of subnets. When creating virtual machines, you pick a zone, and that zone determines which subnet that machine is in. Just like all the other providers, right? Wrong.

In GCP, networks are global and subnets are regional. With everyone else, networks are regional and subnets are zonal. What does that mean in practice?

This allows you to put machines in data centers all over the world and treat them as if they were all on the same LAN. Machines in Asia can communicate with machines in the US via their internal IPs. This makes high-performance worldwide networking, high availability, and disaster recovery easy. You can simply deploy resources in multiple regions within the same project.

Because networks are global, you can create load balancers that balance traffic to machines all over the world. Google’s load balancers automatically route requests to machines closest to the user without you configuring anything. It just works the way it should work.

Google owns all the fiber connecting their data centers together. Once you are in the network, you can pass data between data centers without leaving the network.

2. App Engine

A student asked what management tools Google provides to help manage applications that require thousands of virtual machines.

Well, the short answer is, you don’t need to manage your machines at all. App Engine will do it for you.

App Engine deployment is completely automated with a single command. Applications contain one or more services. Multiple versions of each service can exist simultaneously. You can split traffic between versions for A/B testing. When deploying new versions, there is zero downtime. You can roll back to older versions in a second if you ever need to.

Auto scaling is completely automated. Instances start in a couple hundred milliseconds. Because instances start so quickly, App Engine applications can scale to zero instances when there is no traffic. When an application has zero instances, you are charged nothing. Thus, you don’t have to worry about stopping old versions of services over time because they clean themselves up. App Engine is designed to run at Google’s scale, which means it runs at everyone’s scale.

Load balancing is completely automated. You don’t configure anything; it just works. Health checks are completely automated. All requests are queued automatically, so you don’t have to worry about that. App Engine includes a free caching service, so you don’t have to set that up.

While other providers offer competing products, there really is nothing else like App Engine.

3. Security

All data stored in GCP is encrypted by default. There’s nothing to configure, and you couldn’t turn encryption off if you wanted to. Files are not saved onto a single disk; they are divided into chunks, and the chunks are saved onto different physical disks in a massively distributed file system.

All data passed between services within the network is also encrypted. Because Google owns all the fiber connecting its data centers, traffic between regions doesn’t leave the network.

Because you are running on the same infrastructure Google uses, you get their network security for free. So, denial-of-service protection and intrusion detection are just there.

For more details on Google Security, read the documentation at: https://cloud.google.com/security/.


Why Become a Certified Google Cloud Architect?
by Doug Rehnstrom

The life of a Cloud Architect…

A software architect’s job is to draw rectangles with arrows pointing at them.  

A cloud architect’s job is a little more complicated. First, you draw a computer, a phone, and a tablet on the left. Then, you draw a cloud. Then, you draw some rectangles to the right of the cloud. Lastly, you point arrows at things. Some architects will get fancy and strategically place a cylinder or two on the drawing. They might even draw rectangles within rectangles! Like this:

Sounds easy, right? The trick is, you have to label the rectangles.

If your only tool is a hammer, then every problem is a nail.

If you want to start using Google Cloud Platform, you might tell your IT guys to learn about Google Cloud infrastructure. They would likely go off and learn about Compute Engine and Networking. Then, they might fill in the rectangles as shown below:

If you told some programmers to go learn how to program on Google Cloud Platform, they might fill in the rectangles as shown here:

Both drawings might be “correct” in the sense that we could use either to get a system up and running. The question is, are we optimizing the use of the platform if we are only using one or two services?

Platform… what is he talking about?

Google Cloud Platform has many services: Compute Engine, App Engine, Dataflow, Dataproc, BigQuery, Pub/Sub, BigTable, and many more. To be a competent Google Cloud Platform Architect, you have to know what the services are, what they are intended to be used for, what they cost, and how to combine them to create solutions that are optimized. Optimized for what?  Cost, scalability, availability, durability, performance, etc.  

When someone takes the Google Cloud Architect certification exam, they are being tested on their ability to architect optimal systems. They are being tested on whether they know which service to use for which use cases. They are being tested on whether they can design a system that meets application performance requirements at the lowest cost.

Why being certified is important to your company.

Recently, a guy was complaining about his 4-million-dollar-per-year bill for his Hadoop cluster running on GCP. He didn’t have to be spending that much. A bit of architecture training could have saved his company, oh I don’t know, 3.5 million dollars!

Send your IT people and your programmers to my Google Cloud Architect Certification Workshop. I’ll show them the right way to label the rectangles and help them pass the exam. Maybe we can even save you some money.



Preparing for
the Gig Economy
by Steve Blais

The “gig economy.” That is supposedly where we are headed. There are predictions, based on a study by Intuit, that by 2020 over 40% of the workforce will be working “gigs” rather than holding full-time permanent employment. This is nothing new. Thirty years ago, Tom Peters predicted the “Corporation of One,” in which everyone working for a company would be a consultant rather than an employee.


Porting from Standard GAE to Managed VM: Part I
by Arthur Messenger

It started out so simple. I found this little module, RandomWords, for generating random words or word lists, and I wanted to show how to add this package to the Standard GAE environment (GAE). So I modified the GAE HelloWorld app to say "Hello <random_word>". This did not work: GAE only allows you to import modules that are written in pure Python, and RandomWords compiles a C shared object as part of its install. Now curious, I found a module, names, written in pure Python that generates random first names, last names, or full names based on the 1990 US Census data (http://www.census.gov/main/www/cen1990.html). This blog post covers what I did to make this work.


HUB.DOCKER.COM and Deleting a Repository
by Arthur Messenger


If you have used the registry at hub.docker.com, you already know that the option to delete an individual tagged entry is not available on the interface.

From what I can tell, there isn’t a way to accomplish this, even in the REST interface. The best I have been able to do is to use the REST DELETE command to delete the repository. This means downloading any images I want to save, deleting the repository, recreating the repository, and uploading these saved images back to the repository.
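Assuming the Docker Hub v2 REST API as I understand it (the endpoint shape and JWT header should be checked against the current API docs, and the namespace, repository, and token below are placeholders), the DELETE request can be built like this:

```python
import urllib.request

def delete_repo_request(namespace, repo, token):
    """Build the REST DELETE request for an entire Docker Hub repository."""
    # endpoint per the Docker Hub v2 API; verify against current documentation
    url = "https://hub.docker.com/v2/repositories/%s/%s/" % (namespace, repo)
    req = urllib.request.Request(url, method="DELETE")
    req.add_header("Authorization", "JWT %s" % token)
    return req

# To actually send it (destructive -- this removes the repository and
# every tagged image in it):
#   urllib.request.urlopen(delete_repo_request("myuser", "myrepo", my_jwt))
```

Remember the workflow above: pull any images you want to keep before sending this request, because the delete takes every tag with it.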

The rest of this blog describes what happened when I used the REST DELETE command to delete the repository.