Outages like the recent Google Cloud outage are nothing new, yet they continue to make headlines because their effects are felt worldwide. This most recent outage affected a variety of services, including Gmail, YouTube, Snapchat, and even Nest thermostats. Luckily, it happened on a Sunday, limiting the impact on enterprises that rely on Google Cloud Platform services to run their businesses. Nonetheless, users around the world noticed, and many took to Twitter to express their outrage!
While we shouldn’t accept outages like these as commonplace, the reality is that they are going to happen, and companies should be prepared when they do. Many assume that cloud services will be 100% reliable and do too little to design resiliency and redundancy into their cloud deployments.
As Kurt Marko emphasizes in his article, “Google Cloud Outage Caused Much Twitter Angst, but Provides Teachable Moment for Enterprises,” outages like these should be a catalyst for system and application architects to take a long look at the design of their cloud deployments. Marko offers three techniques for hardening applications against such failures:
Employing load balancers to monitor servers and reroute traffic from offline or degraded servers to those in either another data center (zone in cloud parlance) or region that can best handle it.
Deploying VMs and other so-called zonal resources in multiple regions (see this explanation of the geographic extent and redundancy of various GCP services for details).
Using a robust, distributed storage service such as Google Cloud SQL which replicates data across multiple zones in a region, and setting up replicas in other regions to protect against systemic network failures such as Sunday’s. Alternatively, use a multi-region database like Cloud Spanner. Also consider using Google’s dual-region option for its Cloud Storage object store, which is in beta.
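To make the first technique concrete, here is a minimal Python sketch of the routing decision a health-aware load balancer makes: prefer a healthy server in the client's own zone, then one elsewhere in the same region, then one in another region. It is an illustration only, not Google Cloud's load-balancing API. The Backend class, the pick_backend function, and the server names are hypothetical, and the region and zone labels are used purely as examples.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Backend:
    """A hypothetical server entry as a load balancer might track it."""
    name: str
    region: str
    zone: str
    healthy: bool  # result of the most recent health check


def pick_backend(
    backends: Sequence[Backend], client_region: str, client_zone: str
) -> Optional[Backend]:
    """Return the closest healthy backend, or None if every backend is down."""

    def distance(b: Backend) -> int:
        # 0 = same zone, 1 = same region but another zone, 2 = another region
        if b.region == client_region and b.zone == client_zone:
            return 0
        if b.region == client_region:
            return 1
        return 2

    # Offline or degraded servers are excluded before ranking by distance.
    healthy = [b for b in backends if b.healthy]
    return min(healthy, key=distance, default=None)


# Example fleet: the server in the client's own zone has failed its health check.
fleet = [
    Backend("web-a", "us-east1", "us-east1-b", healthy=False),
    Backend("web-b", "us-east1", "us-east1-c", healthy=True),
    Backend("web-c", "europe-west1", "europe-west1-b", healthy=True),
]

chosen = pick_backend(fleet, "us-east1", "us-east1-b")
print(chosen.name if chosen else "no healthy backend")
```

With this fleet, traffic from us-east1-b falls back to web-b in a neighboring zone. If every us-east1 server were marked unhealthy, as in a region-wide failure, the same function would return web-c, which is only possible because resources were deployed in more than one region to begin with.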
Read more at: diginomica, “Google Cloud Outage Caused Much Twitter Angst, but Provides Teachable Moment for Enterprises”

