An Elevated View of the Tanzu Kubernetes Grid Service Architecture

June 29, 2020 Kendrick Coleman

Before getting started using the Tanzu Kubernetes Grid (TKG) Service for vSphere, it helps to have an understanding of the Kubernetes architecture and the underlying technology that makes it possible. In this post, by starting at the lowest layer and zooming out, we will paint a picture of how all these technologies are interconnected. This is an introduction, not a deep dive, and as such is meant to be a high-level overview for anyone getting started.

There are multiple layers to look at. We’ll begin by taking a look at the Kubernetes architecture itself. We won’t go deep into the services or how storage and networking work, but instead focus on the components that comprise the Kubernetes infrastructure. Then we’ll turn to the Cluster API architecture, and look at how it automates Kubernetes deployments. Finally, we’ll look at the vSphere with Kubernetes environment to see how all these components come together.

Let’s get started.

The Kubernetes architecture

At the end of the day, our goal is to get a containerized application running. If you’re planning to use the TKG Service for vSphere, you’re likely already familiar with what a container is and, by extension, what a container runtime is. Kubernetes adds a topology on top of that runtime and combines features that make it one of the best-suited container schedulers available.

In the context of Kubernetes, a container is not the lowest-level object. Rather, it’s the pod. A pod can have one or more containers within it. An application can be spread out amongst multiple containers, pods, and virtual machines (VMs). Kubernetes provides the flexibility to architect the application that best fits your environment and needs.

A pod has to run somewhere, which is where the first part of the Kubernetes infrastructure layer comes into play. This is the Kubernetes worker or node. The Kubernetes infrastructure is made up of only a few pieces. Just as vSphere ESXi hosts are responsible for running virtualized applications, the Kubernetes worker is responsible for running containerized applications.

The Kubernetes worker can run multiple pods; the size of each pod and the number of pods that can run on a worker depend on the size of the worker. This is analogous to ESXi hosts and virtual machine capacity.
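As a rough illustration of that capacity analogy, the sketch below estimates how many identical pods fit on one worker. The function name, the resource figures, and the reserved-overhead values are hypothetical, and a real cluster weighs many more constraints than CPU and memory.

```python
def pods_per_worker(worker_cpu_m, worker_mem_mib, pod_cpu_m, pod_mem_mib,
                    reserved_cpu_m=0, reserved_mem_mib=0):
    """Estimate how many identical pods fit on one worker.

    CPU is in millicores and memory in MiB. The reserved_* values stand in
    for capacity held back for the OS and node agents (hypothetical numbers).
    """
    usable_cpu = worker_cpu_m - reserved_cpu_m
    usable_mem = worker_mem_mib - reserved_mem_mib
    if usable_cpu <= 0 or usable_mem <= 0:
        return 0
    # The scarcer of the two resources decides the limit.
    return min(usable_cpu // pod_cpu_m, usable_mem // pod_mem_mib)


if __name__ == "__main__":
    # A 4-vCPU, 16 GiB worker running pods that request 500m CPU and 1 GiB each.
    print(pods_per_worker(4000, 16384, 500, 1024,
                          reserved_cpu_m=500, reserved_mem_mib=1024))  # 7
```

In this example CPU is the limiting resource: memory alone would allow 15 pods, but only 7 fit once CPU is counted.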

Next is the Kubernetes controller. Like vCenter, this is the brain of the Kubernetes deployment. It is packed with all the services required to keep the cluster functioning, along with components for deploying applications to its worker nodes.

The API server is the central communication hub. It provides REST-based services that the components use to talk to one another and that users interact with when deploying applications. The API server also acts as the front end of the cluster by exposing the Kubernetes API. Internal components, such as the scheduler or nodes, and external components, such as kubectl or API-driven systems, make calls to the API server.

The Kubernetes controller-manager is a service that watches the shared state of the cluster through the API server and makes changes in an attempt to move the current state towards the desired state. Each controller provides a control loop that evaluates the current state of the system, compares the current state to the desired state, and takes action to reconcile the differences.
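The control loop idea can be sketched in a few lines. This is a toy model rather than how the controller-manager is implemented: state is reduced to a pod count, and the function names are hypothetical.

```python
def reconcile_once(desired, current):
    """One pass of the loop: compare states and take a single corrective step."""
    if current < desired:
        return current + 1, "create pod"
    if current > desired:
        return current - 1, "delete pod"
    return current, "in sync"


def run_control_loop(desired, current, max_passes=20):
    """Repeat reconcile_once until the current state matches the desired state."""
    log = []
    for _ in range(max_passes):
        current, action = reconcile_once(desired, current)
        log.append(action)
        if action == "in sync":
            break
    return current, log


if __name__ == "__main__":
    # Desired state is 3 pods, but only 1 is running.
    print(run_control_loop(desired=3, current=1))
    # (3, ['create pod', 'create pod', 'in sync'])
```

The loop never needs to know how the drift happened; it only observes the difference and acts on it, which is why the same pattern handles both scale-up and recovery from failures.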

The scheduler watches for newly created pods that have not yet been assigned to a node and selects a worker for each one to run on.
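A minimal sketch of the scheduling decision, assuming a pod only asks for CPU and memory: filter out workers that cannot fit the pod, then pick one of the rest. The real scheduler filters and scores nodes on many more criteria; the node names and numbers here are hypothetical.

```python
def schedule(pod_request, nodes):
    """Pick a node for a pod, or return None if nothing fits.

    pod_request: dict with "cpu" (millicores) and "mem" (MiB).
    nodes: dict mapping node name to its free "cpu" and "mem".
    """
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = {
        name: free for name, free in nodes.items()
        if free["cpu"] >= pod_request["cpu"] and free["mem"] >= pod_request["mem"]
    }
    if not feasible:
        return None  # nothing fits, so the pod would stay pending
    # Score: prefer the node with the most free CPU; ties go to the first name.
    return max(sorted(feasible), key=lambda name: feasible[name]["cpu"])


if __name__ == "__main__":
    workers = {
        "worker-1": {"cpu": 500, "mem": 2048},
        "worker-2": {"cpu": 2000, "mem": 4096},
        "worker-3": {"cpu": 2000, "mem": 512},
    }
    print(schedule({"cpu": 1000, "mem": 1024}, workers))  # worker-2
```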

Etcd is effectively the cluster's database: a key-value store that saves the current state of the cluster. The control plane of Kubernetes can scale as well. Doing so requires more complex configuration, such as setting up a load balancer, but etcd will replicate changes across the controller nodes to create a highly available solution.
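Etcd replicates with a majority-based consensus protocol, so the number of controller nodes determines how many failures the control plane can survive. The arithmetic is sketched below; the function names are just for illustration.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def failures_tolerated(members):
    """How many members can be lost while a majority remains."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"{n} members: quorum {quorum(n)}, tolerates {failures_tolerated(n)} failure(s)")
```

Note that four members tolerate no more failures than three, which is why control planes are usually sized with an odd number of nodes.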

The number of worker nodes needed depends on the resources required to run the applications, and it can scale as needed.

And all of this combined represents a single Kubernetes cluster. At the infrastructure level, these are all VMs running on vSphere.

There are lots of blogs, articles, GitHub repos, and tools available for installing Kubernetes, from the lowest level (installing individual services on Linux and creating TLS certificates) all the way up to the highest level, where everything is deployed automatically. But while multiple attempts have been made at delivering a single Kubernetes installer experience, each is either tailored to a specific infrastructure provider, was developed before Kubernetes began to mature, or is proprietary.

The Cluster API architecture 

Cluster API is a tool that was developed in the Kubernetes open source community under the special interest group (SIG) Cluster Lifecycle. That means it’s a tool accepted by the upstream and larger Kubernetes community for building and deploying Kubernetes clusters in an automated way. It adheres to the same Kubernetes principle of using a declarative control loop to achieve a desired state, and it handles all the lifecycle functions: create, scale, upgrade, and delete.

At a high level, this is how Cluster API works. As a user, I define a cluster specification. Within this specification, I define the types of machines that will make up my cluster. Using the standard kubectl command line, I apply this to a Kubernetes cluster that has the Cluster API components installed. The Kubernetes cluster with Cluster API is considered my management cluster. This cluster is responsible for communicating with the infrastructure provider of my choice and deploying a Kubernetes cluster based on the specification. Cluster API uses the declarative nature of Kubernetes to make sure it achieves my desired state. If at a later time I want to add more Kubernetes workers or masters, I simply edit the specification and apply it to the management cluster.
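To make the "edit the specification and apply it" step concrete, here is a toy sketch of how a management cluster might turn an edited spec into a plan. It does not use the real Cluster API types; the spec layout, role names, and function name are assumptions made for illustration.

```python
def plan_changes(spec, machines):
    """Compare the wanted machine counts with the machines that exist.

    spec: dict mapping a role to the number of machines wanted.
    machines: list with one role string per existing machine.
    Returns a dict of role -> delta (positive means add, negative means remove).
    """
    plan = {}
    for role, wanted in spec.items():
        have = machines.count(role)
        if wanted != have:
            plan[role] = wanted - have
    return plan


if __name__ == "__main__":
    existing = ["controlPlane", "workers", "workers", "workers"]
    # The user edits the spec from 3 workers to 5 and re-applies it.
    edited_spec = {"controlPlane": 1, "workers": 5}
    print(plan_changes(edited_spec, existing))  # {'workers': 2}
```

The user never issues an "add two workers" command; the plan falls out of the difference between the declared spec and what already exists.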

Cluster API has many different providers available, including a Cluster API provider for vSphere that lets it communicate with vCenter and deploy virtual machines based on templates or a subscribed content library.

The vSphere with Kubernetes environment

vSphere with Kubernetes takes all of this to another level by introducing an architecture that provides even more features and unique capabilities. Within vSphere there is the concept of the Supervisor Cluster. This functions as the management cluster for Cluster API. It’s represented as multiple VMs that are automatically created when enabling the workload platform service. The Supervisor Cluster is responsible for deploying Kubernetes clusters as part of the TKG Service.

vSphere with Kubernetes features the vSphere namespaces concept. Like Kubernetes or Linux namespaces, it defines a boundary or security context. A vSphere namespace is like a resource pool, but it can run multiple types of Kubernetes objects such as VMs, vSphere pods, and multiple TKG clusters. The namespaces have role-based access control, which is inherited through vSphere single sign-on; there are also resource and object limits. These limits give administrators control over the namespaces so they can make sure applications do not take up more resources than allowed.
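The effect of a namespace resource limit can be sketched as a simple admission check. This is an illustration of the idea only, not how vSphere enforces limits; the resource keys and function name are hypothetical.

```python
def admit(limits, usage, request):
    """Decide whether a new workload fits within a namespace's limits.

    limits, usage, request: dicts keyed by resource name, for example
    "cpu_m" (millicores) and "mem_mib" (MiB). A resource with no entry
    in limits is treated as unlimited.
    Returns (allowed, reason).
    """
    for resource, amount in request.items():
        limit = limits.get(resource)
        if limit is None:
            continue  # no limit configured for this resource
        used = usage.get(resource, 0)
        if used + amount > limit:
            return False, f"{resource}: {used} used + {amount} requested exceeds limit {limit}"
    return True, "within limits"


if __name__ == "__main__":
    limits = {"cpu_m": 8000, "mem_mib": 16384}
    usage = {"cpu_m": 6000, "mem_mib": 4096}
    print(admit(limits, usage, {"cpu_m": 1000, "mem_mib": 1024}))
    print(admit(limits, usage, {"cpu_m": 3000, "mem_mib": 1024}))
```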

Within vCenter, after the workload service has been enabled on a cluster, the Supervisor Cluster VMs are deployed. These VMs represent the Kubernetes and Cluster API management cluster. Each has its own IP address, but through a leader election process, only one will be the interface needed to interact with a TKG Service cluster.

This example manifest is for a TKG Service cluster. Stay tuned for an upcoming blog post that examines this line by line, but in the meantime, it’s easy to see the cluster specification needed to represent a new Kubernetes cluster. 

After this specification is applied to the Supervisor Cluster, it will invoke the Cluster API components to achieve a desired state. The Kubernetes controller and worker nodes are represented as VMs within the namespace that was defined in the specification. 
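As a rough sketch of what such a specification contains, the snippet below assembles one as a Python dictionary and prints it. The apiVersion and field layout reflect the v1alpha1 TanzuKubernetesCluster resource as an assumption, and every value (cluster name, namespace, version, VM class, storage class) is a hypothetical placeholder, so check the product documentation before using any of it.

```python
import json


def node_pool(count, vm_class, storage_class):
    """Describe one group of identical node VMs."""
    return {"count": count, "class": vm_class, "storageClass": storage_class}


def tkg_cluster_manifest(name, namespace, version, control_plane, workers):
    """Assemble a TanzuKubernetesCluster-style spec as a plain dict."""
    return {
        "apiVersion": "run.tanzu.vmware.com/v1alpha1",  # assumed API version
        "kind": "TanzuKubernetesCluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "distribution": {"version": version},
            "topology": {"controlPlane": control_plane, "workers": workers},
        },
    }


# All values below are hypothetical placeholders.
manifest = tkg_cluster_manifest(
    name="demo-cluster",
    namespace="demo-namespace",
    version="v1.16",
    control_plane=node_pool(1, "best-effort-small", "demo-storage-policy"),
    workers=node_pool(3, "best-effort-small", "demo-storage-policy"),
)
print(json.dumps(manifest, indent=2))
```

The namespace in the metadata is what ties the cluster to a vSphere namespace, and the two count fields are the values to edit when scaling.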

Want to see it all together? Check out the video.

This high-level overview was meant as an introduction to the TKG Service architecture for beginners. Upcoming blog posts will walk through how to deploy a TKG cluster and then how to scale it. For more information, check out the Tanzu Kubernetes Grid site.

About the Author

Kendrick Coleman is a reformed sysadmin and virtualization junkie. His attention has shifted from hypervisors to cloud-native platforms focused on containers. In his role as an Open Source Technical Product Manager, he figures out new and interesting ways to run open source cloud native infrastructure tools with VMware products. He's involved with the Kubernetes SIG community and frequently blogs about all the things he's learning. He has been a speaker at DockerCon, OpenSource Summit, ContainerCon, CloudNativeCon, and many more. His free time is spent sharing bourbon industry knowledge hosting the Bourbon Pursuit Podcast.
