Why Become a Certified Google Cloud Architect?
by Doug Rehnstrom

The life of a Cloud Architect…

A software architect’s job is to draw rectangles with arrows pointing at them.  

A cloud architect’s job is a little more complicated. First, you draw a computer, a phone, and a tablet on the left. Then, you draw a cloud. Then, you draw some rectangles to the right of the cloud. Lastly, you point arrows at things. Some architects will get fancy and strategically place a cylinder or two on the drawing. They might even draw rectangles within rectangles! Like this:

Sounds easy, right? The trick is, you have to label the rectangles.

If your only tool is a hammer, then every problem is a nail.

If you want to start using Google Cloud Platform, you might tell your IT guys to learn about Google Cloud infrastructure. They would likely go off and learn about Compute Engine and Networking. Then, they might fill in the rectangles as shown below:

If you told some programmers to go learn how to program on Google Cloud Platform, they might fill in the rectangles as shown here:

Both drawings might be “correct” in the sense that we could use either to get a system up and running. The question is, are we optimizing the use of the platform if we are only using one or two services?

Platform… what is he talking about?

Google Cloud Platform has many services: Compute Engine, App Engine, Dataflow, Dataproc, BigQuery, Pub/Sub, Bigtable, and many more. To be a competent Google Cloud Platform architect, you have to know what the services are, what they are intended to be used for, what they cost, and how to combine them to create solutions that are optimized. Optimized for what? Cost, scalability, availability, durability, performance, and so on.

When someone takes the Google Cloud Architect certification exam, they are being tested on their ability to architect optimal systems. They are being tested on whether they know which service to use for which use cases. They are being tested on whether they can design a system that meets application performance requirements at the lowest cost.

Why being certified is important to your company.

Recently, a guy was complaining about his $4 million-per-year bill for his Hadoop cluster running on GCP. He didn’t have to be spending that much. A bit of architecture training could have saved his company, oh, I don’t know, $3.5 million!

Send your IT people and your programmers to my Google Cloud Architect Certification Workshop. I’ll show them the right way to label the rectangles and help them pass the exam. Maybe we can even save you some money.



Understanding Denormalization for BigQuery
by Doug Rehnstrom

Understanding Denormalization for BigQuery

A long time ago in a galaxy far, far away...

In order to understand denormalization, we need to take a trip back in time, back to the last century. This was a time when CPU speeds were measured in megahertz and hard drives were sold by the megabyte. Passing ruffians were sometimes seen saying “neee” to old ladies, and modems made funny noises when connecting to online services. Oh, these were dark days.

In these ancient times we normalized our databases. But why? It was simple really. Hard drive space was expensive and computers were slow. When saving data, we wanted to use as little space as possible, and data retrieval had to use as little compute power as possible. Normalization saved space. Data separated into many tables could be combined in different ways for flexible data retrieval. Indexes made querying from multiple tables fast and efficient.
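As a toy illustration (mine, not from the original post, with made-up table contents), normalization splits data into separate tables that are recombined at query time. Here is a minimal in-memory sketch in Python of an orders table, an order-details table, and the join that stitches them back together:

```python
# Hypothetical normalized data: orders and their line items live in
# separate "tables", linked by order_id.
orders = [
    {"order_id": 1, "customer": "Ada"},
    {"order_id": 2, "customer": "Grace"},
]
order_details = [
    {"order_id": 1, "item": "keyboard", "qty": 1},
    {"order_id": 1, "item": "trackball", "qty": 2},
    {"order_id": 2, "item": "modem", "qty": 1},
]

def join_orders(orders, details):
    """Recombine the two tables, like a SQL inner join on order_id."""
    return [
        {**o, **d}
        for o in orders
        for d in details
        if o["order_id"] == d["order_id"]
    ]

rows = join_orders(orders, order_details)
```

The flexibility comes from the split: the same two tables can be joined, filtered, or aggregated in whatever combination a query needs.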

It was, however, complicated. Sometimes databases were poorly designed. Other times, data requirements changed over time, causing a good design to deteriorate. Sometimes there were conflicting data requirements. A proper design for one use case might be a poor design for a different use case. And what about when you wanted to combine data from different databases or data sources that were not relational? Oh the humanity...

Neanderthals developed tools...

Then Google said, “Go west young man and throw hardware at the problem.”
“What do you mean?” asked the young prince.

If hard drives are slow, just connect a lot of them together, and combined, they will be faster. And don’t worry about indexes. Just dump the data in files and read the files with a few thousand computers.      

And the road to BigQuery was paved... 

Run, Forest, Run! 

When data is put in BigQuery, each field is stored in a separate file on a massively distributed file system. Storage is cheap: only a couple of cents per GB per month. Storage is plentiful: there is no limit to the amount of data that can be put into BigQuery. Storage is fast, super fast! Terabytes can be read in seconds.

Data processing is done on a massive cluster of computers, which is separate from storage. Storage and compute are connected by a petabit network. Processing is fast and plentiful. If the job is big, just use more machines!

Danger, Will Robinson! 

Ah, but there is a caveat: joins are expensive in BigQuery. That doesn’t mean you can’t do a join; it just means there might be a more efficient way.

Denormalize the data! (said in a loud, deep voice with a little echo)

BigQuery tables support nested, hierarchical data. So, in addition to the usual data types like strings, numbers, and booleans, fields can be records, which are composite types made up of multiple fields. Fields can also be repeated, like an array. Thus, a field can be an array of a complex type. So, you don’t need two tables for a one-to-many relationship. Mash it together into one table.

Don’t store your orders in one table and order details in a different table. In the Orders table, create a field called Details, which is an array of complex records containing all the information about each item ordered. Now there is no need for a join, which means we can run an efficient query without an index.
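To make that concrete, here is a sketch (the field names are invented for illustration) of a denormalized Orders table, expressed as Python dicts in the shape of BigQuery’s JSON schema format, where the repeated record is a field with `"type": "RECORD"` and `"mode": "REPEATED"`:

```python
# Hypothetical schema for a denormalized Orders table: "details" is a
# REPEATED RECORD field, i.e. an array of structs nested in each row.
orders_schema = [
    {"name": "order_id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "customer", "type": "STRING", "mode": "NULLABLE"},
    {
        "name": "details",
        "type": "RECORD",
        "mode": "REPEATED",
        "fields": [
            {"name": "item", "type": "STRING", "mode": "NULLABLE"},
            {"name": "qty", "type": "INTEGER", "mode": "NULLABLE"},
        ],
    },
]

# One row holds the order AND all of its line items -- no join needed.
order_row = {
    "order_id": 1,
    "customer": "Ada",
    "details": [
        {"item": "keyboard", "qty": 1},
        {"item": "trackball", "qty": 2},
    ],
}

# "How many items were ordered?" is answered by scanning a single row.
total_qty = sum(d["qty"] for d in order_row["details"])
```

Everything about an order travels together in one row, which is exactly what lets BigQuery skip the join.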

“But doesn’t this make querying less flexible?” asked the young boy in the third row with the glasses and Steph Curry T-shirt.

Yes, I guess that’s true. But storage is cheap and plentiful. So, just store the data multiple times in different ways when you want to query it in different ways.

“Heresy!” screamed the old man as he was dragged away clinging to his ergonomic keyboard and trackball.

Preparing for the Google Cloud Architect Certification – Getting Started
by Doug Rehnstrom


Cloud computing is all the rage these days, and getting certified is a great way to demonstrate your proficiency.  

The Google Certified Professional - Cloud Architect exam itself is very effective in making sure you are a practitioner, not just someone who memorized a bunch of terms and statistics. This is a true test of experience and solution-oriented thinking.

Here are some tips for preparing.


If you are new to Google Cloud Platform (GCP)

Some think cloud computing is just virtual machines running in someone else’s data center. While that is part of it, there is a lot more. Being a competent architect requires a broad understanding of cloud services.

To get an overview of the products offered by GCP, go to https://cloud.google.com. Spend a couple hours exploring the links and reading the case studies.


There’s no substitute for getting your hands dirty

Go to https://console.cloud.google.com, log in with your Google or Gmail account, and then sign up for the free trial. You will get a $300 credit that you can use to learn GCP. This credit is good for a year.

Once you have an account, do the following labs (don’t worry if you don’t understand everything).


Take a Class

The first class to take is a 1-day overview titled Google Cloud Fundamentals: Core Infrastructure.

To help defray the cost, when you register for the course, use the promotional code DOUG and you will get in for $99.

You can also take this course for free on Coursera, https://www.coursera.org/learn/gcp-fundamentals.

The second course is Architecting with Google Cloud Platform: Infrastructure. Come to the first course, and I will make sure you get a big discount on the second one too!

Soon we will also be releasing two new Google Cloud Certification Workshops. Stay tuned...


Next Steps

I’ve reached my word count for one post. Get these things done and I’ll follow up with another in a little while. 

Let me know how you are doing, doug@roitraining.com.


Preparing for
the Gig Economy
by Steve Blais

The “gig economy”: that is supposedly where we are headed. There are predictions, based on a study by Intuit, that by 2020 over 40% of the workforce will be working “gigs” rather than in full-time permanent employment. This is nothing new. Thirty years ago, Tom Peters predicted the “Corporation of One,” in which everyone working for a company would be a consultant rather than an employee.


Introspection in Python 2.7
by Arthur Messenger

I was reading about introspection in Python 2.7 and came across code similar to what is shown in Figure 1: class Bag. It turns out to be a very interesting piece of code for me, as it uses several of what I would consider intermediate Python techniques. In Part I of this blog post, I will cover some of the interesting Python constructs that crossed my mind when looking at the code. Part II is a short explanation of introspection in Python 2.7.

1 class Bag:
2     def __init__(self, **d):
3         for k,v in d.iteritems():
4             exec("self.%s = %s" % (k,v))

Figure 1: class Bag (figure numbers are only for reference)
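As an aside (my sketch, not part of the original figure), the exec-based version is fragile: `exec("self.%s = %s" % (k, v))` pastes the value in unquoted, so it breaks for string values like `"hello"`. The built-in setattr stores any value as-is and avoids the quoting problem:

```python
class Bag(object):
    """Like Figure 1, but using setattr instead of exec.

    exec("self.%s = %s" % (k, v)) fails for v = "hello" because the
    string is interpolated without quotes; setattr has no such issue.
    """
    def __init__(self, **d):
        for k, v in d.items():   # d.iteritems() in Python 2.7
            setattr(self, k, v)

b = Bag(color="red", count=3)
```

Of course, the exec version is what makes Figure 1 an interesting specimen of dynamic Python in the first place.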


Porting from Standard GAE to Managed VM: Part I
by Arthur Messenger

It started out so simple. I found this little module, RandomWords, for generating random words or word lists, and I wanted to show how to add this package to the Standard GAE environment (GAE). So I modified the GAE HelloWorld app to say "Hello <random_word>". This did not work. GAE only allows you to import modules that are written in pure Python, and RandomWords compiles a C shared object as part of its install. Now curious, I found a module, names, written in pure Python that generates random first names, last names, or full names based on the 1990 US Census data (http://www.census.gov/main/www/cen1990.html). This blog post covers what I did to make this work.