What Happens When a Physical Node Fails in VMware vSphere with Tanzu

May 6, 2022 Kendrick Coleman

Picture this… You just got your second cup of coffee and you’re walking back to your desk. The phone in your pocket begins vibrating as a flurry of emails show services are bouncing. The notifications say services were down but are now back up. You hustle back to your desk, spill a little coffee on your notepad, and open the VMware vSphere client to see a host in a disconnected state. Something on the physical host failed but thankfully vSphere High Availability (HA) worked just as intended and all your Kubernetes clusters and containers are functional once again. We know that vSphere HA worked, but how does it function when using vSphere with Tanzu? 

Let’s break this down in a few different ways so we can see how the requirements for vSphere with Tanzu come into play. This will uncover some hidden affinity rules and how they interact with vSphere functionality. Lastly, a feature of Cluster API plays a vital role in multi-layer availability.

At a minimum, vSphere with Tanzu requires a three-host vSphere cluster for setup and configuration. This satisfies the first rule of availability because the Supervisor Cluster (Kubernetes management cluster) is a three-node control plane. This is crucial to the integrity of the etcd cluster that maintains the state of the Kubernetes cluster deployed by default. Never put all your eggs (or, in the case of etcd, a majority of your members) in one basket.
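The quorum arithmetic behind that three-node requirement is easy to sketch. This is just the standard majority math etcd uses, not VMware-specific code:

```python
# Why an etcd cluster needs a majority of its members healthy:
# quorum for n members is floor(n/2) + 1, and the cluster can
# tolerate n - quorum failures before it stops accepting writes.

def quorum(members: int) -> int:
    """Minimum number of healthy members etcd needs to accept writes."""
    return members // 2 + 1

def fault_tolerance(members: int) -> int:
    """How many members can fail before the cluster loses quorum."""
    return members - quorum(members)

# A three-node Supervisor control plane tolerates one failed member:
print(fault_tolerance(3))  # → 1
# A single node tolerates none, which is why the host count matters:
print(fault_tolerance(1))  # → 0
```

With three members, losing one host still leaves two of three members, which is a majority, so etcd keeps serving writes.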

During the deployment process, a firm anti-affinity compute policy is applied to the Supervisor using vSphere ESX Agent Manager. This ensures that the Supervisor control plane nodes will not run on the same host. After setup is complete, the hidden anti-affinity compute policy will utilize vMotion to migrate control plane nodes when necessary to make sure they are not on the same host. When a host needs to be placed in maintenance mode, vSphere will relax the policy and allow two control plane nodes to exist on the same host, if necessary.

What about the workload clusters? These same exact rules and policies are applied to workload clusters and their control plane nodes as well. Each cluster will have its own hidden anti-affinity compute policy created. 

If you are a vSAN customer, everything will work as expected out-of-the-box. When datastores other than vSAN are used, it is recommended to place the three control plane virtual machines on different datastores for availability reasons. 

The control plane nodes are separated for the integrity of etcd, but what about the worker nodes? It’s critical to keep the worker nodes separated on different hosts as much as possible because any Kubernetes deployment that uses multiple replicas will have containers spread across them. Just as before, the worker nodes also get their own hidden anti-affinity compute policy that will spread them amongst the hosts. 
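Spreading workers matters because it determines the blast radius of a single host failure. Here is a small sketch, with hypothetical worker and host names, of how many replicas of a Deployment survive when one host goes down:

```python
# Sketch with illustrative names: if worker nodes are spread across
# hosts, a single host failure takes out at most one replica of a
# Deployment whose pods are spread across those workers.

worker_to_host = {
    "worker-1": "esxi-01",
    "worker-2": "esxi-02",
    "worker-3": "esxi-03",
}
replica_to_worker = {
    "web-0": "worker-1",
    "web-1": "worker-2",
    "web-2": "worker-3",
}

def surviving_replicas(failed_host):
    """Replicas still running after the given host fails."""
    return [r for r, w in replica_to_worker.items()
            if worker_to_host[w] != failed_host]

print(surviving_replicas("esxi-01"))  # → ['web-1', 'web-2']
```

If all three workers sat on the same host, the same failure would take the service to zero replicas until vSphere HA restarted the VMs.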

Alright, so what happens when a physical node fails? As boring as it sounds, vSphere HA does exactly what it’s supposed to do. All control plane and worker node virtual machines will be restarted on different hosts in the cluster, following the anti-affinity compute policies. If the cluster just went from three to two hosts, vSphere would bend the rules to make sure all services can come online. Once the failed host is remediated, the rules kick back into place and the virtual machines will use vMotion to find their new home.

But wait, there’s more! There is also another fail-safe happening, and that is Cluster API Machine Health Check. When a Kubernetes cluster is created, there are a series of health checks that are continually happening through a set of controllers. These provide the status of each node with information about its health. 

This comes into play in a few scenarios. The first is when someone accidentally powers off a node. If that ever happens, the node is no longer sending health information and the control plane starts a ten-minute timer. After the ten minutes are up, a "Power On" command is sent to vSphere and the machine will be turned on.
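The grace-period logic can be sketched in a few lines. This is a hedged illustration of the Machine Health Check idea described above, not the Cluster API controller code itself; the function name and return values are made up:

```python
# Illustrative sketch: a node that stops reporting health gets a grace
# period (ten minutes in the scenario above) before remediation kicks in.

GRACE_PERIOD_SECONDS = 10 * 60

def remediation_action(seconds_since_last_heartbeat: int) -> str:
    """Decide what to do about a silent node."""
    if seconds_since_last_heartbeat < GRACE_PERIOD_SECONDS:
        return "wait"       # the node may come back on its own
    return "power-on"       # ask vSphere to power the VM back on

print(remediation_action(5 * 60))   # → wait
print(remediation_action(11 * 60))  # → power-on
```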

The second scenario is when an ESXi host loses network connectivity but remains reachable through vSphere HA datastore heartbeating. Machine Health Check will remediate the cluster by creating a new virtual machine that will be added to the cluster and will also remediate any affected deployments.

Rest easy knowing that vSphere and Kubernetes are working together to keep your applications running. Availability spans multiple levels to ensure production workloads stay in production. A special thanks goes to the Tanzu engineering team for sharing their insight that made this technology and blog post possible. 

If you are new to Kubernetes and want to try vSphere with Tanzu, go check out the hands-on lab HOL-2213-01-SDC in the VMware Hands-on Labs where you can begin deploying, managing, and working with Kubernetes clusters.

About the Author

Kendrick Coleman is a reformed sysadmin and virtualization junkie. His attention has shifted from hypervisors to cloud native platforms focused on containers. In his role as an Open Source Technical Product Manager, he figures out new and interesting ways to run open source cloud native infrastructure tools with VMware products. He's involved with the Kubernetes SIG community and frequently blogs about all the things he's learning. He has been a speaker at DockerCon, OpenSource Summit, ContainerCon, CloudNativeCon, and many more. His free time is spent sharing bourbon industry knowledge hosting the Bourbon Pursuit Podcast.
