3 Reasons to Choose Google Cloud Platform
by Doug Rehnstrom

While teaching Google Cloud Onboard events in DC and New York over the last couple of weeks, I was coincidentally asked the same question at both events: “Give me a few technical reasons why I would choose Google over other cloud providers.”

1. The Network

When you create a network in Google Cloud Platform, at first it looks like all the other cloud providers. A network is a collection of subnets. When creating virtual machines, you pick a zone, and that zone determines which subnet the machine is in. Just like all the other providers, right? Wrong.

In GCP, networks are global and subnets are regional. With everyone else, networks are regional and subnets are zonal. What does that mean in practice?

This allows you to put machines in data centers all over the world and treat them as if they were all on the same LAN. Machines in Asia can communicate with machines in the US via their internal IPs. This makes high-performance worldwide networking, high availability, and disaster recovery easy. You can simply deploy resources in multiple regions within the same project.
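
To make that concrete, here is a sketch of creating one global network with subnets in two regions, using the (modern) google-cloud-compute Python client. The project ID, network name, regions, and CIDR ranges are all illustrative, not from the original post:

    from google.cloud import compute_v1

    PROJECT = 'my-project'  # illustrative project ID

    # One network; in GCP a network is global by default
    network = compute_v1.Network(
        name='my-global-net',
        auto_create_subnetworks=False,  # we will define regional subnets ourselves
    )
    compute_v1.NetworksClient().insert(
        project=PROJECT, network_resource=network
    ).result()

    # Two subnets in different regions, same network: machines in either
    # region reach each other over their internal IPs
    subnets = compute_v1.SubnetworksClient()
    for region, cidr in [('us-east1', '10.0.0.0/20'), ('asia-east1', '10.0.16.0/20')]:
        subnets.insert(
            project=PROJECT,
            region=region,
            subnetwork_resource=compute_v1.Subnetwork(
                name='subnet-' + region,
                ip_cidr_range=cidr,
                region=region,
                network='projects/%s/global/networks/my-global-net' % PROJECT,
            ),
        ).result()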

Because networks are global, you can create load balancers that balance traffic to machines all over the world. Google’s load balancers automatically route requests to machines closest to the user without you configuring anything. It just works the way it should work.

Google owns the fiber connecting its data centers. Once your traffic is on Google’s network, it passes between data centers without ever touching the public internet.

2. App Engine

A student asked what management tools Google provides to help manage applications that require thousands of virtual machines.

Well, the short answer is, you don’t need to manage your machines at all. App Engine will do it for you.

App Engine deployment is completely automated with a single command. Applications contain one or more services. Multiple versions of each service can exist simultaneously. You can split traffic between versions for A/B testing. When deploying new versions, there is zero downtime. You can roll back to an older version in seconds if you ever need to.
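
For a sense of how little there is to manage, here is a minimal sketch of an App Engine standard service in Python 2.7 (the runtime of that era), using the webapp2 framework; the handler and message are illustrative. Alongside this file, a short app.yaml names the runtime, and deploying is the single command gcloud app deploy:

    # main.py -- a minimal App Engine standard (Python 2.7) service
    import webapp2

    class MainPage(webapp2.RequestHandler):
        def get(self):
            # App Engine routes the request here; there are no servers to manage
            self.response.headers['Content-Type'] = 'text/plain'
            self.response.write('Hello from App Engine!')

    # The WSGI application object that App Engine serves
    app = webapp2.WSGIApplication([('/', MainPage)], debug=False)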

Auto scaling is completely automated. Instances start in a couple hundred milliseconds. Because instances start so quickly, App Engine applications can scale to zero instances when there is no traffic. When an application has zero instances, you are charged nothing. Thus, you don’t have to worry about stopping old versions of services over time because they clean themselves up. App Engine is designed to run at Google’s scale, which means it runs at everyone’s scale.

Load balancing is completely automated. You don’t configure anything; it just works. Health checks are completely automated. All requests are queued automatically, so you don’t have to worry about that. App Engine includes a free caching service, so you don’t have to set that up.
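
That caching service is memcache, built into the standard runtime. A minimal sketch, assuming the Python 2.7 runtime and an illustrative expensive lookup:

    from google.appengine.api import memcache

    def get_greeting(user_id):
        # Check the built-in cache first
        greeting = memcache.get('greeting:%s' % user_id)
        if greeting is None:
            # Stand-in for an expensive lookup or computation
            greeting = 'Hello, user %s' % user_id
            # Cache the result for ten minutes
            memcache.add('greeting:%s' % user_id, greeting, time=600)
        return greeting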

While other providers offer competing products, there really is nothing else like App Engine.

3. Security

All data stored in GCP is encrypted by default. There’s nothing to configure, and you couldn’t turn encryption off if you wanted to. Files are not saved onto a single disk; they are divided into chunks, and the chunks are saved onto different physical disks in a massively distributed file system.

All data passed between services within the network is also encrypted. Because Google owns all the fiber connecting its data centers, traffic between regions doesn’t leave the network.

Because you are running on the same infrastructure Google uses, you get Google’s network security for free. Denial-of-service protection and intrusion detection are just there.

For more details on Google Security, read the documentation at: https://cloud.google.com/security/.


Why Become a Certified Google Cloud Architect?
by Doug Rehnstrom

The life of a Cloud Architect…

A software architect’s job is to draw rectangles with arrows pointing at them.  

A cloud architect’s job is a little more complicated. First, you draw a computer, a phone, and a tablet on the left. Then, you draw a cloud. Then, you draw some rectangles to the right of the cloud. Lastly, you point arrows at things. Some architects will get fancy and strategically place a cylinder or two on the drawing. They might even draw rectangles within rectangles! Like this:

Sounds easy, right? The trick is, you have to label the rectangles.

If your only tool is a hammer, then every problem is a nail.

If you want to start using Google Cloud Platform, you might tell your IT guys to learn about Google Cloud infrastructure. They would likely go off and learn about Compute Engine and Networking. Then, they might fill in the rectangles as shown below:

If you told some programmers, “Go learn how to program on Google Cloud Platform,” they might fill in the rectangles as shown here:

Both drawings might be “correct” in the sense that we could use either to get a system up and running. The question is, are we optimizing the use of the platform if we are only using one or two services?

Platform… what is he talking about?

Google Cloud Platform has many services: Compute Engine, App Engine, Dataflow, Dataproc, BigQuery, Pub/Sub, Bigtable, and many more. To be a competent Google Cloud Platform architect, you have to know what the services are, what they are intended to be used for, what they cost, and how to combine them to create solutions that are optimized. Optimized for what? Cost, scalability, availability, durability, performance, etc.

When someone takes the Google Cloud Architect certification exam, they are being tested on their ability to architect optimal systems. They are being tested on whether they know which service to use for which use cases. They are being tested on whether they can design a system that meets application performance requirements at the lowest cost.

Why being certified is important to your company.

Recently, a guy was complaining about the $4 million per year bill for his Hadoop cluster running on GCP. He didn’t have to be spending that much. A bit of architecture training could have saved his company, oh I don’t know, $3.5 million!

Send your IT people and your programmers to my Google Cloud Architect Certification Workshop. I’ll show them the right way to label the rectangles and help them pass the exam. Maybe we can even save you some money.

 


Understanding Denormalization for BigQuery
by Doug Rehnstrom

A long time ago in a galaxy far, far away...

In order to understand denormalization, we need to take a trip back in time, back to the last century. This was a time when CPU speeds were measured in megahertz and hard drives were sold by the megabyte. Passing ruffians were sometimes seen saying “neee” to old ladies, and modems made funny noises when connecting to online services. Oh, these were dark days.

In these ancient times we normalized our databases. But why? It was simple really. Hard drive space was expensive and computers were slow. When saving data, we wanted to use as little space as possible, and data retrieval had to use as little compute power as possible. Normalization saved space. Data separated into many tables could be combined in different ways for flexible data retrieval. Indexes made querying from multiple tables fast and efficient.

It was, however, complicated. Sometimes databases were poorly designed. Other times, data requirements changed over time, causing a good design to deteriorate. Sometimes there were conflicting data requirements. A proper design for one use case might be a poor design for a different use case. And what about when you wanted to combine data from different databases or data sources that were not relational? Oh the humanity...

Neanderthals developed tools...

Then Google said, “Go west, young man, and throw hardware at the problem.”
“What do you mean?” asked the young prince.

If hard drives are slow, just connect a lot of them together, and combined, they will be faster. And don’t worry about indexes. Just dump the data in files and read the files with a few thousand computers.      

And the road to BigQuery was paved... 

Run, Forrest, Run!

When data is put in BigQuery, each field is stored in a separate file on a massively distributed file system. Storage is cheap: only a couple of cents per GB per month. Storage is plentiful: there is no limit to the amount of data that can be put into BigQuery. Storage is fast, super fast! Terabytes can be read in seconds.

Data processing is done on a massive cluster of computers, which is separate from storage. Storage and compute are connected with a petabit network. Processing is fast and plentiful. If the job is big, just use more machines!

Danger, Will Robinson! 

Ah, but there is a caveat: joins are expensive in BigQuery. That doesn’t mean you can’t do a join; it just means there might be a more efficient way.

Denormalize the data! (said in a loud, deep voice with a little echo)

BigQuery tables support nested, hierarchical data. So, in addition to the usual data types like strings, numbers, and booleans, fields can be records, which are composite types made up of multiple fields. Fields can also be repeated, like an array. Thus, a field can be an array of a complex type. So, you don’t need two tables for a one-to-many relationship. Mash it together into one table.

Don’t store your orders in one table and order details in a different table. In the Orders table, create a field called Details, which is an array of complex records containing all the information about each item ordered. Now there is no need for a join, which means we can run an efficient query without an index.
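
As a sketch, here is what that Orders table could look like when created with the google-cloud-bigquery Python client; the project, dataset, and field names are illustrative:

    from google.cloud import bigquery

    schema = [
        bigquery.SchemaField('order_id', 'STRING', mode='REQUIRED'),
        bigquery.SchemaField('order_date', 'TIMESTAMP'),
        # Details is a repeated record: one entry per item ordered
        bigquery.SchemaField('Details', 'RECORD', mode='REPEATED', fields=[
            bigquery.SchemaField('product', 'STRING'),
            bigquery.SchemaField('quantity', 'INTEGER'),
            bigquery.SchemaField('unit_price', 'FLOAT'),
        ]),
    ]

    client = bigquery.Client()
    table = bigquery.Table('my-project.sales.Orders', schema=schema)
    table = client.create_table(table)  # no join needed at query time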

“But doesn’t this make querying less flexible?” asked the young boy in the third row with the glasses and the Steph Curry T-shirt.

Yes, I guess that’s true. But storage is cheap and plentiful. So, just store the data multiple times in different ways when you want to query it in different ways.
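
And when you do want the row-per-item view back, standard SQL can flatten the repeated field at query time with UNNEST. A sketch against the same illustrative table:

    from google.cloud import bigquery

    client = bigquery.Client()

    # UNNEST turns each element of the repeated Details field into its own row
    query = """
        SELECT o.order_id, d.product, d.quantity * d.unit_price AS line_total
        FROM `my-project.sales.Orders` AS o, UNNEST(o.Details) AS d
        WHERE d.quantity > 1
    """
    for row in client.query(query).result():
        print(row.order_id, row.product, row.line_total)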

“Heresy!” screamed the old man as he was dragged away, clinging to his ergonomic keyboard and trackball.

Preparing for the Google Cloud Architect Certification – Getting Started
by Doug Rehnstrom

 

Cloud computing is all the rage these days, and getting certified is a great way to demonstrate your proficiency.  

The Google Certified Professional - Cloud Architect exam itself is very effective in making sure you are a practitioner, not just someone who memorized a bunch of terms and statistics. This is a true test of experience and solution-oriented thinking.

Here are some tips for preparing.

 

If you are new to Google Cloud Platform (GCP)

Some think cloud computing is just virtual machines running in someone else’s data center. While that is part of it, there is a lot more. Being a competent architect requires a broad understanding of cloud services.

To get an overview of the products offered by GCP, go to https://cloud.google.com. Spend a couple hours exploring the links and reading the case studies.

 

There’s no substitute for getting your hands dirty

Go to https://console.cloud.google.com, log in with your Google or Gmail account, and then sign up for the free trial. You will get a $300 credit that you can use to learn GCP. This credit is good for a year.

Once you have an account, do the following labs (don’t worry if you don’t understand everything).

 

Take a Class

The first class to take is a 1-day overview titled Google Cloud Fundamentals: Core Infrastructure.

To help defray the cost, use the promotional code DOUG when you register for the course and you will get in for $99.

You can also take this course for free on Coursera, https://www.coursera.org/learn/gcp-fundamentals.

The second course is Architecting with Google Cloud Platform: Infrastructure. Come to the first course, and I will make sure you get a big discount on the second one too!

Soon we will also be releasing two new Google Cloud Certification Workshops. Stay tuned...

 

Next Steps

I’ve reached my word count for one post. Get these things done and I’ll follow up with another in a little while. 

Let me know how you are doing, doug@roitraining.com.


Preparing for the Gig Economy
by Steve Blais

The “gig economy.” That is supposedly where we are headed. There are predictions, based on a study by Intuit, that by 2020 over 40% of the workforce will be working “gigs” rather than full-time permanent employment. This is nothing new. Thirty years ago, Tom Peters predicted the “Corporation of One,” in which everyone working for a company would be a consultant rather than an employee.


Introspection in Python 2.7
by Arthur Messenger

I was reading about introspection in Python 2.7 and came across code similar to what is shown in Figure 1: class Bag. It turned out to be a very interesting passage of code for me, as it uses many of what I would consider intermediate Python techniques. In Part I of this blog post, I will cover some of the interesting Python constructs that crossed my mind when looking at the code. Part II is a short explanation of introspection in Python 2.7.

1 class Bag:
2     def __init__(self, **d):
3         # iteritems() is the Python 2.7 iterator over the kwargs dict
4         for k, v in d.iteritems():
5             # Build an assignment as a string and execute it; %s leaves
6             # v unquoted, so this works for Bag(x=1) but not Bag(s="hi")
7             exec("self.%s = %s" % (k, v))

Figure 1: class Bag (figure numbers are only for reference)
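
As a quick usage sketch (numeric values only, per the caveat in the comments):

    b = Bag(x=1, y=2.5)
    print(b.x)  # 1
    print(b.y)  # 2.5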