Cloud without Kubernetes

I think it’s safe to say Kubernetes has “won” the cloud mindshare game. If you look at the CNCF Cloud Native landscape (and manage to not go cross-eyed), it seems like most of the projects are somehow related to Kubernetes. KubeCon is one of the fastest-growing industry events. Companies we talk to at Real Kinetic who are either preparing for or currently executing migrations to the cloud are centering their strategies around Kubernetes. Those already in the cloud are investing heavily in platform-izing their Kubernetes environment. Kubernetes competitors like Nomad, Pivotal Cloud Foundry, OpenShift, and Rancher have sort of just faded into the background (or simply pivoted to Kubernetes). In many ways, “cloud native” seems to be equated with “Kubernetes”.

All this is to say, the industry has coalesced around Kubernetes as the way to do cloud. But after working with enough companies doing cloud, watching their experiences, and understanding their business problems, I can’t help but wonder: should it be? Or rather, is Kubernetes actually the right level of abstraction?

Going k8sless

While we’ve worked with a lot of companies doing Kubernetes, we’ve also worked with some that are deliberately not. Instead, they’ve leaned into serverless—heavily—or, as I like to call it, they’ve gone k8sless. These are not small companies or startups; they are name brands you would recognize.

At first, we were skeptical. Our team came from a company that made it all the way to IPO using Google App Engine, one of the earliest serverless platforms available. We have regularly espoused the benefits of serverless. We’ve talked to clients about how they should consider it for their own workloads (often to great skepticism). But using only serverless? For once, we were the serverless skeptics. One client in particular was beginning a migration of their e-commerce platform to Google Cloud. They wanted to do it completely serverless. We gave our feedback and recommendations based on similar migrations we’ve performed:

“There are workloads that aren’t a good fit.”

“It would require major re-architecting.”

“It will be expensive once fully migrated.”

“You’ll have better cost efficiency bin packing lots of services into VMs with Kubernetes.”

We articulated all the usual arguments made by the serverless doubters. Even Google was skeptical, echoing our sentiments to the customer. “Serious companies doing online retail like The Home Depot or Target are using Google Kubernetes Engine,” was more or less the message. We have a team of serverless experts at Real Kinetic though, so we forged ahead and helped execute the migration.

Fast forward nearly three years later and we will happily admit it: we were wrong. You can run a multibillion-dollar e-commerce platform without a single VM. You don’t have to do a full rewrite or major re-architecting. It can be cost-effective. It doesn’t require proprietary APIs or constraints that result in vendor lock-in. It might sound like an exaggeration, but it’s not.

Container as the interface

Over the last several years, Google’s serverless offerings have evolved far beyond App Engine. It has reached the point where it’s now viable to run a wide variety of workloads without much issue. In particular, Cloud Run offers many of the same benefits as a PaaS like App Engine without the constraints. If your code can run in a container, there’s a very good chance it will run on Cloud Run with little to no modification.

In fact, other than using the gcloud CLI to deploy a service, there’s nothing really Google- or Cloud Run-specific needed to get a functioning application. This is because Cloud Run uses Knative, an open-source Kubernetes-based platform, as its deployment interface. And while Cloud Run is a Google-managed backend for the Knative interface, we could just as well switch the backend to GKE or our own Kubernetes cluster. When we implement our Cloud Run services, we actually implement them using a Kubernetes Deployment manifest, shown below, and right before deploying, we swap Deployment for Knative’s Service manifest.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    cloud.googleapis.com/location: us-central1
    service: my-service
  name: my-service
spec:
  # selector and matching template labels are required by apps/v1
  # Deployments (the Knative Service we swap in omits them)
  selector:
    matchLabels:
      service: my-service
  template:
    metadata:
      labels:
        service: my-service
    spec:
      containers:
        - image: us.gcr.io/my-project/my-service:v1
          name: my-service
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: 2
              memory: 1024Mi

This means we can deploy to Kubernetes without Knative at all, which we often do during development using the combination of Skaffold and K3s to perform local testing. It also allows us to use Kubernetes-native tooling such as Kustomize to manage configuration. Think of Cloud Run as a Kubernetes Deployment as a service (though really more like Deployment and Service…as a service).
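For reference, here is a sketch of the Knative Service manifest we swap in at deploy time (serving.knative.dev/v1 is the Knative serving API that Cloud Run accepts; the image and names carry over from the Deployment above):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  labels:
    cloud.googleapis.com/location: us-central1
  name: my-service
spec:
  template:
    spec:
      containers:
        - image: us.gcr.io/my-project/my-service:v1
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: 2
              memory: 1024Mi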

“Normal” businesses versus internet-scale businesses

What about cost? Yes, the unit cost in terms of compute is higher with serverless. If you execute enough CPU cycles to fill the capacity of a VM, you are better off renting the whole VM as opposed to effectively renting timeshares of it. But here’s the thing: most “normal” businesses tend to have highly cyclical traffic patterns throughout the day and their scale is generally modest.

What do I mean by “normal” businesses? These are primarily non-internet-scale companies such as insurance, fast food, car rental, construction, or financial services, not Google, Netflix, or Amazon. As a result, these companies can benefit greatly from pay-per-use, and those in the retail space also benefit greatly from the elasticity of this model during periods like Black Friday or promotional campaigns. Businesses with brick-and-mortar locations have traffic that generally follows their operating hours. During off-hours, they can often scale quite literally to zero.

Many of these businesses, for better or worse, treat software development as an IT cost center to be managed. They don’t need—or for that matter, want—the costs and overheads associated with platform-izing Kubernetes. A lot of the companies we interact with fall into this category of “normal” businesses, and I suspect most companies outside of tech do as well.

BYOP—Bring Your Own Platform

I’ve asked it before: is Kubernetes really the end-game abstraction? In my opinion, it’s an implementation detail. I don’t think I’m alone in that opinion. Some companies put a tremendous amount of investment into abstracting Kubernetes from their developers. This is what I mean by “platform-izing” Kubernetes. It typically involves significant and ongoing OpEx investment. The industry has started to coalesce around two concepts that encapsulate this: Platform Engineering and Internal Developer Platform. So while Kubernetes may have become the default container orchestrator, the higher-level pieces—the pieces constituting the Internal Developer Platform—are still very much bespoke. Kelsey Hightower said it best: the majority of people managing infrastructure just want a PaaS. The only requirement: it has to be built by them. That’s a problem.

Imagine having a Kubernetes cluster per Deployment. Full blast radius isolation, complete cost traceability, granular yet simple permissioning. It sounds like a maintenance nightmare though, right? Now imagine those clusters just being hidden from you completely and the Deployment itself is the only thing you interact with and maintain. You just provide your container (or group of containers), configure your CPU and memory requirements, specify the network and resource access, and deploy it. The Deployment manages your load balancing and ingress, automatically scales the pods up and down or canaries traffic, and gives you aggregated logs and metrics out of the box. You only pay for the resources consumed while processing a request. Just a few years ago, this was a futuristic-sounding fantasy.

The platform Kelsey describes above does now exist. From my experience, it’s a nearly ideal solution for those “normal” businesses who are looking to minimize complexity and operational costs and avoid having to bring (more like build) their own platform. I realize GCP is a distant third when it comes to public cloud market share so this will largely fall on deaf ears, but for those who are still listening: stop wasting time on Kubernetes and just use Cloud Run. Let me expand on the reasons why.

  1. Easily and quickly get started with the cloud. Many of the companies we work with who are still in the midst of migrating to the cloud get hung up with analysis paralysis. Cloud Run isn’t a perfect solution for everything, but it’s good enough for the majority of cases. The rest can be handled as exceptions.

  2. Minimize complexity of cloud environments. Cloud Run does not eliminate the need for infrastructure (there are still caches, queues, databases, and so forth), but it greatly simplifies it. Using managed services for the remaining infrastructure pieces simplifies it further.

  3. Increase the efficiency of your developers and reduce operational costs. Rather than spending most of their time dealing with infrastructure concerns, allow your developers to focus on delivering business value. For most businesses, infrastructure is undifferentiated commodity work. By “outsourcing” large parts of your undifferentiated Internal Developer Platform, you can reallocate developers to product or feature development and reduce operational costs. This allows you to get the benefits of Platform Engineering with a fraction of the maintenance and overhead. Lastly, if you are a “normal” business that doesn’t operate at internet scale and has fairly cyclical traffic, it’s entirely likely Cloud Run will be cheaper than VM-based platforms.

  4. Maintain the flexibility to evolve to a more complex solution over time if needed. This is where traditional serverless platforms and PaaS solutions fall short. Again, with Cloud Run there is no actual vendor lock-in; it’s just a Kubernetes Deployment as a Service. Even without Knative, we can take that Deployment and run it in any Kubernetes cluster. This is a very different paradigm from, say, App Engine, where you wrote your application using App Engine APIs and deployed your service to the App Engine runtime. In this new paradigm, the artifact is a Plain Old Container. There are cases where Cloud Run is not a good fit, such as certain kinds of stateful legacy applications or services with sustained, non-cyclical traffic. We don’t want to be painted into a corner in these types of situations, so having flexibility is important.

There are similar analogs to Cloud Run on other cloud platforms. For example, AWS has App Runner. However, in my experience these fall short in terms of developer experience because of either a lack of investment from the cloud provider or environment complexity (as I would argue is the case for AWS). Managed services like Cloud Run are one of the areas where GCP truly excels and differentiates itself.

Just use Cloud Run, seriously

I realize not everyone will be convinced. The gravitational pull of Kubernetes is strong, and as a platform, it’s a safe bet. However, operationalizing Kubernetes properly—whether it’s a managed offering like GKE or not—requires some kind of platform team and ongoing investment. We’ve seen it approached without this, where developers are given clusters or allowed to spin them up and fend for themselves. This quickly becomes untenable because standards are non-existent, security and compliance are unmanageable, and developer time is split between managing infrastructure and actual feature development.

If your organization is unable or unwilling to make this investment, I urge you to consider Cloud Run. There’s still work needed on the periphery to properly operationalize it, such as implementing CI/CD pipelines and managing accessory infrastructure, but it’s a much lower investment. Additionally, it provides an escape hatch—unlike App Engine or traditional PaaS solutions, there is no real switching cost in moving to Kubernetes if you need to in the future. With Cloud Run, serverless has finally reached a tipping point where it’s now viable for a majority of workloads rather than a niche subset. Unlike Kubernetes, it provides the right level of abstraction for most businesses building software. In my opinion, serverless is still not taken seriously due to preconceived notions, but it’s time to start reevaluating those notions.

Agree? Disagree? I’d love to hear your thoughts. If you’re an organization that would like to do cloud differently or are looking for the playbook to operationalize Google Cloud Platform, please get in touch.

Using Google-Managed Certificates and Identity-Aware Proxy With GKE

Ingress on Google Kubernetes Engine (GKE) uses a Google Cloud Load Balancer (GCLB). GCLB provides a single anycast IP that fronts all of your backend compute instances along with a lot of other rich features. In order to create a GCLB that uses HTTPS, an SSL certificate needs to be associated with the ingress resource. This certificate can either be self-managed or Google-managed. The benefit of Google-managed certificates is that they are provisioned, renewed, and managed for your domain names by Google. These managed certificates can also be configured directly with GKE, meaning we can configure our certificates the same way we declaratively configure our other Kubernetes resources such as deployments, services, and ingresses.

GKE also supports Identity-Aware Proxy (IAP), which is a fully managed solution for implementing a zero-trust security model for applications and VMs. With IAP, we can secure workloads in GCP using identity and context. For example, this might be based on attributes like user identity, device security status, region, or IP address. This allows users to access applications securely from untrusted networks without the need for a VPN. IAP is a powerful way to implement authentication and authorization for corporate applications that are run internally on GKE, Google Compute Engine (GCE), or App Engine. This might be applications such as Jira, GitLab, Jenkins, or production-support portals.

IAP works in conjunction with GCLB to secure GKE workloads. In this tutorial, I’ll walk through deploying a workload to a GKE cluster, setting up GCLB ingress for it with a global static IP address, configuring a Google-managed SSL certificate to support HTTPS traffic, and enabling IAP to secure access to the application. In order to follow along, you’ll need a GKE cluster and a domain name to use for the application. In case you want to skip ahead, all of the Kubernetes configuration for this tutorial is available here.

Deploying an Application Behind GCLB With a Managed Certificate

First, let’s deploy our application to GKE. We’ll use a Hello World application to test this out. Our application will consist of a Kubernetes deployment and service. Below is the configuration for these:
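A minimal sketch of these might look like the following (the hello-app sample image and the “web” naming are illustrative; any container serving HTTP on port 8080 works):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: gcr.io/google-samples/hello-app:1.0
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  # NodePort so the GCLB provisioned by the ingress can reach the pods
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080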

Apply these with kubectl:

$ kubectl apply -f .

At this point, our application is not yet accessible from outside the cluster since we haven’t set up an ingress. Before we do that, we need to create a static IP address using the following command:

$ gcloud compute addresses create web-static-ip --global

The above will reserve a static external IP called “web-static-ip.” We can now create an ingress resource using this IP address. Note the “kubernetes.io/ingress.global-static-ip-name” annotation in the configuration:
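A sketch of the ingress, assuming the web service from the sketch above (only the annotation is GKE-specific):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    # route through the reserved global static IP
    kubernetes.io/ingress.global-static-ip-name: web-static-ip
spec:
  defaultBackend:
    service:
      name: web
      port:
        number: 80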

Applying this with kubectl will provision a GCLB that will route traffic into our service. It can take a few minutes for the load balancer to become active and health checks to begin working. Traffic won’t be served until that happens, so use the following command to check whether traffic is being served:

$ curl -i http://<web-static-ip>

You can find <web-static-ip> with:

$ gcloud compute addresses describe web-static-ip --global

Once you start getting a successful response, update your DNS to point your domain name to the static IP address. Wait until the DNS change has propagated and your domain name points to the application running in GKE. This could take 30 minutes or so.

After DNS has been updated, we’ll configure HTTPS. To do this, we need to create a Google-managed SSL certificate. This can be managed by GKE using the following configuration:
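A minimal ManagedCertificate resource looks like the following (web-certificate is our name for it):

apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: web-certificate
spec:
  domains:
    - example.com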

Ensure that “example.com” is replaced with the domain name you’re using.

We now need to update our ingress to use the new managed certificate. This is done using the “networking.gke.io/managed-certificates” annotation.
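Building on the earlier ingress sketch, the updated resource might look like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: web-static-ip
    # reference the ManagedCertificate by name
    networking.gke.io/managed-certificates: web-certificate
spec:
  defaultBackend:
    service:
      name: web
      port:
        number: 80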

We’ll need to wait a bit for the certificate to finish provisioning. This can take up to 15 minutes. Once it’s done, we should see HTTPS traffic flowing correctly:

$ curl -i https://example.com

We now have a working example of an application running in GKE behind a GCLB with a static IP address and domain name secured with TLS. Now we’ll finish up by enabling IAP to control access to the application.

Securing the Application With Identity-Aware Proxy

If you’re enabling IAP for the first time, you’ll need to configure your project’s OAuth consent screen. The steps here will walk through how to do that. This consent screen is what users will see when they attempt to access the application before logging in.

Once IAP is enabled and the OAuth consent screen has been configured, there should be an OAuth 2 client ID created in your GCP project. You can find this under “OAuth 2.0 Client IDs” in the “APIs & Services” > “Credentials” section of the cloud console. When you click on this credential, you’ll find a client ID and client secret. These need to be provided to Kubernetes as a secret so they can be used by a BackendConfig for configuring IAP. Apply the secret to Kubernetes with the following command, replacing “xxx” with the respective credentials:

$ kubectl create secret generic iap-oauth-client-id \
--from-literal=client_id=xxx \
--from-literal=client_secret=xxx

BackendConfig is a Kubernetes custom resource used to configure ingress in GKE. This includes features such as IAP, Cloud CDN, Cloud Armor, and others. Apply the following BackendConfig configuration using kubectl, which will enable IAP and associate it with your OAuth client credentials:
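Here is a sketch of that BackendConfig (config-default is our name for it, referenced by the service annotation below; the secret name matches the one created earlier):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-default
spec:
  iap:
    enabled: true
    oauthclientCredentials:
      # the Kubernetes secret holding client_id and client_secret
      secretName: iap-oauth-client-id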

We also need to ensure there are service ports associated with the BackendConfig in order for IAP to be enabled. One way to do this is to make all ports for the service default to the BackendConfig, which is done by setting the “beta.cloud.google.com/backend-config” annotation to {"default": "config-default"} in the service resource. See below for the updated service configuration.
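The updated service from the earlier sketch, with the annotation applied:

apiVersion: v1
kind: Service
metadata:
  name: web
  annotations:
    # default all of this service's ports to the IAP-enabled BackendConfig
    beta.cloud.google.com/backend-config: '{"default": "config-default"}'
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080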

Once you’ve applied the annotation to the service, wait a couple minutes for the infrastructure to settle. IAP should now be working. You’ll need to assign the “IAP-secured Web App User” role in IAP to any users or groups who should have access to the application. Upon accessing the application, you should now be greeted with a login screen.

Your Kubernetes workload is now secured by IAP! Do note that VPC firewall rules can be configured to bypass IAP, such as rules that allow traffic internal to your VPC or GKE cluster. IAP will provide a warning indicating which firewall rules allow bypassing it.

For an extra layer of security, IAP sets signed headers on inbound requests which can be verified by the application. This is helpful in the event that IAP is accidentally disabled or misconfigured or if firewall rules are improperly set.

Together with GCLB and GCP-managed certificates, IAP provides a great solution for serving and securing internal applications that can be accessed anywhere without the need for a VPN.

There and Back Again: Why PaaS Is Passé (And Why It’s Not)

In 10 years nobody will be talking about Kubernetes. Not because people stopped using it or because it fell out of favor, but because it became a utility. Containers, Kubernetes, service meshes—they’ll all be there, the same way VMs, hypervisors, and switches will be. Compute is a commodity, and I don’t care how my workload runs so long as it meets my business’s SLOs and other requirements. Within AWS alone, there are now innumerable ways to run a compute workload.

This was the promise of Platform as a Service (PaaS): provide a pre-built runtime where you simply plug in your application and the rest—compute, networking, storage—is handled for you. Heroku (2007), Google App Engine (2008), OpenShift (2011), and Cloud Foundry (2011) all come to mind. But PaaS has, in many ways, become a sort of taboo in recent years. As a consultant working with companies either in the cloud or looking to move to the cloud, I’ve found PaaS to almost be a trigger word; the wince from clients upon its utterance is almost palpable. It’s hard to pin down exactly why this is the case, but I think there are a number of reasons which range from entirely legit to outright FUD.

There is often a funny cognitive dissonance with these companies who recoil at the mention of PaaS. After unequivocally rejecting the idea for reasons like vendor lock-in and runtime restrictions (again, some of these are legitimate concerns), they will describe, in piecemeal fashion, their own half-baked idea of a PaaS. “Well, we’ll use Kubernetes to handle compute, ELK stack for logging, Prometheus for metrics, OpenTracing for distributed tracing, Redis for caching…”, and so the list goes on. Not to mention there tends to be a bias toward build over buy. And we need to somehow provide all of these things as a self-service platform to developers.

While there are ongoing efforts to democratize the cloud and provide reference architectures of sorts, the fact is there are no standards, and the proliferation of tools and technologies continues to expand at a rapid pace. On the other hand, as certain tools emerge, such as Kubernetes, the patterns and practices around them have naturally lagged behind. The serverless movement bears this out further. Serverless is the microservice equivalent of PaaS but with a lot less tooling and operations maturity. This is an exciting time, but the cloud has become—without a doubt—an unnavigable wasteland. Even with all the things at your disposal today, it’s still a ton of work to build and operate what is essentially your own PaaS.

But technology is cyclical and the cloud is no different. This evolution, in some sense, parallels what happened with the NoSQL movement. Eric Brewer discusses this in his RICON 2012 talk. When you cut through the hype, NoSQL was about giving developers more control at the expense of less pre-packaged functionality, but it was not intended to be the end game or an alternative to SQL. It’s about two different, equally valid world views: top-down and bottom-up. The top-down view is looking at a model and its semantics and then figuring out what you need to do to implement it. With a relational database, this is using SQL to declaratively construct our model. The bottom-up view is about the layering of primitive components into something more complex. For example, modern databases like CockroachDB present a SQL abstraction on top of a transactional layer on top of a replication layer on top of a simple key-value-store layer. NoSQL gives us a reusable storage component with a lot of flexibility and, over time, as we add more and more pieces on top, we get something that looks more like a database. We start with low-level layers, but the end goal is still the same: nice, user-friendly semantics. I would argue the same thing is happening with PaaS.

What the major cloud providers are doing is unbundling the PaaS. We have our compute, our cluster scheduler, our databases and caches, our message queues, and other components. What’s missing is the glue—the standards and tools that tie these things together into a coherent, manageable unit—a PaaS. Everything old is new again. What we will see is the rebundling of these components gradually happen over time as those standards and tools emerge. Tools like AWS Fargate and Google App Engine Flexible Environment are a step in that direction (Google really screwed up by calling it App Engine Flex because of all the PaaS baggage associated with the App Engine name). The container is just the interface. However, that’s only the start.

PaaS and serverless are great because they truly accelerate application development and reduce operations overhead. However, the trade-off is that we become constrained. For example, with App Engine, we were initially constrained to certain Google Cloud APIs, such as Cloud Datastore and Task Queues, and specific language runtimes. Over time this has improved, notably with Cloud SQL, and today we can use custom runtimes. Similarly, PaaS gives us service autoscaling, high availability, and critical security patches for free, but we lose a degree of control over compute characteristics and workload-processing patterns.

In a sense, what a PaaS offers is an opinionated framework for running applications. Opinionated is good if you want to be productive, but it’s limiting once you have a mature product. What we want are the benefits of PaaS with a bit more flexibility. A PaaS provides us a top-down template from which we can start, but we want to be able to tweak that to our needs. Kubernetes is a key part of that template, but it’s ultimately just a means to an end.

This is why I think no one will be talking about Kubernetes in 10 years. Hopefully by then it’s just not that interesting. If it still is, we’re not done yet.