Exploring kube-apiserver load balancers for on-premises Kubernetes clusters

When you bootstrap a Kubernetes cluster in a non-cloud environment, one of the first hurdles is provisioning a load balancer for the kube-apiserver. If you are running a non-HA cluster with a single control plane node, the load balancer is unnecessary because all API requests go directly to that node. In highly available configurations, a load balancer must sit in front of the kube-apiservers to route requests to healthy API servers.

With cloud platforms such as AWS, it’s trivial to click a few buttons and launch an elastic load balancer. Outside of these platforms, the solution is less obvious. There are, however, load balancing options that can be deployed in non-cloud environments.

First, let’s review why the kube-apiserver load balancer is necessary.

The load balancer sits in front of the kube-apiservers and routes traffic to them. If a kube-apiserver goes down, the load balancer routes traffic around the failure.

Worker nodes communicate with the control plane through a single API endpoint. Using a load balancer for the endpoint ensures that API requests are properly distributed to a healthy kube-apiserver. If there were no load balancer in place, each worker would need to choose a specific kube-apiserver to communicate with. If this kube-apiserver were to fail, it would cause cascading failures to the bound worker nodes, which is the opposite of high availability.
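
To make the argument concrete, here is a minimal simulation in Python. It is only a sketch: the function names are hypothetical, the addresses are the three example kube-apiserver addresses used later in this post, and no real network traffic is involved. It compares workers pinned to one kube-apiserver with workers that go through a load-balanced endpoint when a single kube-apiserver fails.

```python
API_SERVERS = ["10.10.10.10", "10.10.10.11", "10.10.10.12"]


def pinned_request(worker_index, down):
    """Worker talks to one fixed kube-apiserver; fails if that server is down."""
    target = API_SERVERS[worker_index % len(API_SERVERS)]
    return target not in down


def balanced_request(down):
    """Worker talks to the load balancer, which forwards to any healthy server."""
    return any(server not in down for server in API_SERVERS)


down = {"10.10.10.10"}  # simulate one failed kube-apiserver
workers = range(6)

pinned_ok = [pinned_request(w, down) for w in workers]
balanced_ok = [balanced_request(down) for _ in workers]

print("pinned:  ", pinned_ok)
print("balanced:", balanced_ok)
```

In this toy model, every worker pinned to the failed server loses access to the API, while workers behind the load balancer keep working as long as at least one kube-apiserver is healthy.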

In the rest of this blog post, we’ll discuss several options for implementing a kube-apiserver load balancer for an on-premises cluster, including an option for those running Kubernetes on VMware vSphere.

DNS for Load Balancing

A common approach is to use round-robin DNS as a load balancer. This method has several disadvantages. DNS performs no health checks, so it cannot route around failed servers. Unpredictable caching, both in the DNS hierarchy and on the client side, makes management and updates difficult. Because of these drawbacks, there are better options to explore.
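
The missing health checks can be illustrated with a small Python sketch. It is a simplified model, not a real resolver: it assumes each lookup yields the next address in rotation and that the client uses whatever it is given. The helper names and the choice of failed address are illustrative.

```python
from itertools import cycle, islice

A_RECORDS = ["10.10.10.10", "10.10.10.11", "10.10.10.12"]


def round_robin_dns(records):
    """Hand out records in rotation, with no knowledge of server health."""
    return cycle(records)


def failed_requests(answers, down, count):
    """Count how many of the next `count` answers point at a failed server."""
    return sum(1 for ip in islice(answers, count) if ip in down)


down = {"10.10.10.11"}  # one kube-apiserver has failed
failures = failed_requests(round_robin_dns(A_RECORDS), down, 9)
print(f"{failures} of 9 lookups returned a dead address")
```

Under these assumptions, roughly one in three requests keeps landing on the failed server until someone edits the DNS record and every cache along the way expires.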

Option One: Standalone HAProxy

HAProxy is a quick and easy option to use. After installing the package on your load balancing server, you configure the list of kube-apiservers along with their health checks. Here’s an example configuration that balances three kube-apiservers, 10.10.10.10, 10.10.10.11 and 10.10.10.12.

global
   log /dev/log local0
   log /dev/log local1 notice
   chroot /var/lib/haproxy
   stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
   stats timeout 30s
   user haproxy
   group haproxy
   daemon

   # Default SSL material locations
   ca-base /etc/ssl/certs
   crt-base /etc/ssl/private

   # Default ciphers to use on SSL-enabled listening sockets.
   # For more information, see ciphers(1SSL). This list is from:
   #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
   # An alternative list with additional directives can be obtained from
   #  https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
   ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
   ssl-default-bind-options no-sslv3

defaults
   log global
   mode http
   option httplog
   option dontlognull
   timeout connect 5000
   timeout client  50000
   timeout server  50000
   errorfile 400 /etc/haproxy/errors/400.http
   errorfile 403 /etc/haproxy/errors/403.http
   errorfile 408 /etc/haproxy/errors/408.http
   errorfile 500 /etc/haproxy/errors/500.http
   errorfile 502 /etc/haproxy/errors/502.http
   errorfile 503 /etc/haproxy/errors/503.http
   errorfile 504 /etc/haproxy/errors/504.http

frontend k8s-api
   bind 0.0.0.0:6443
   mode tcp
   option tcplog
   default_backend k8s-api

backend k8s-api
   mode tcp
   option tcplog
   option tcp-check
   balance roundrobin
   default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100

   server apiserver1 10.10.10.10:6443 check
   server apiserver2 10.10.10.11:6443 check
   server apiserver3 10.10.10.12:6443 check
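
The rise 2 and fall 2 settings on the default-server line above mean that a server changes state only after two consecutive health check results that contradict its current state. The Python sketch below models that idea alongside a plain TCP connect probe. It is a simplified illustration, not HAProxy's implementation, and the names tcp_probe and ServerState are hypothetical.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class ServerState:
    """Track up/down state from consecutive probe results (rise/fall)."""

    def __init__(self, rise=2, fall=2):
        self.rise = rise
        self.fall = fall
        self.up = True    # assumption: the server starts out healthy
        self.streak = 0   # consecutive results contradicting the current state

    def record(self, probe_ok):
        if probe_ok == self.up:
            self.streak = 0  # result agrees with the current state
            return self.up
        self.streak += 1
        threshold = self.fall if self.up else self.rise
        if self.streak >= threshold:
            self.up = not self.up  # enough contrary results in a row: flip
            self.streak = 0
        return self.up


# Replay a fixed series of probe results instead of probing a real server.
state = ServerState(rise=2, fall=2)
history = [state.record(ok) for ok in [True, False, True, False, False, True, True]]
print(history)
```

In a real loop you would feed state.record() with tcp_probe("10.10.10.10", 6443) at a fixed interval. A single failed probe does not remove the server from rotation, which avoids flapping on transient network errors.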

A drawback of this option is that the instance running HAProxy becomes a single point of failure. The whole point of high availability is to remove single points of failure, not introduce one. Outside of quick lab or proof-of-concept clusters, this approach is not recommended.

Option Two: The Keepalived Package with HAProxy

Keepalived is a powerful package that uses the Virtual Router Redundancy Protocol (VRRP) to provide a floating IP address shared between Linux hosts. Two instances of HAProxy are launched, a primary instance and a standby instance. If the primary instance fails, Keepalived moves, or “floats,” the IP address to the standby, so clients keep using the same address with at most a brief interruption while the failover completes.
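
The failover decision can be pictured as a priority-based election: among the nodes that are still alive, the one with the highest priority holds the floating IP address. The Python sketch below models only that rule. It is not how Keepalived is implemented, and the node names and priority values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int  # higher priority wins the election
    alive: bool = True


def vip_holder(nodes):
    """Return the alive node with the highest priority, or None if all are down."""
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.priority)


primary = Node("haproxy-primary", priority=101)
standby = Node("haproxy-standby", priority=100)
nodes = [primary, standby]

before = vip_holder(nodes).name
primary.alive = False  # simulate the primary instance failing
after = vip_holder(nodes).name
print(before, "->", after)
```

In an actual deployment the standby learns that the primary is gone when its VRRP advertisements stop arriving. The sketch replaces that detection with a simple alive flag.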

Another interesting way to use Keepalived is to run it inside a container. These containers can run as static pods alongside HAProxy, or even directly on the control plane nodes themselves. With static pods, you maintain your load balancing solution with Kubernetes manifests, just as you do your workloads.

The drawback of Keepalived is that you still need to choose and maintain the actual load balancing software. Keepalived provides only the floating IP address functionality.

Option Three: vSphere HA

VMware vSphere High Availability (HA) is yet another option. The HA setting is enabled at the cluster level.

If a physical ESXi host fails, vSphere HA automatically restarts the HAProxy VM on a healthy ESXi host within the cluster, while maintaining the same IP address.

A great thing about vSphere HA is that it works at the VM level, so it’s agnostic to the software running inside the VM. There is also no need for specialized configuration because vSphere handles it for you.

Note that you must use a shared datastore for VMs to successfully float between hosts. VMware vSAN is a great choice here, but external options such as iSCSI will also work.

A downside is the speed of the operation. If the HAProxy VM is running on a host that experiences an immediate failure, there will be a delay due to the time needed for the failure detection and VM boot time on a healthy host. The HAProxy endpoint will be non-responsive until the operation completes.
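
If you want to quantify this delay in your own environment, one approach is to probe the endpoint at a fixed interval during a failover test and compute the longest gap between a failed probe and the next successful one. The Python sketch below shows only that calculation. The function name is hypothetical and the sample data is synthetic, not a measurement from a real cluster.

```python
def longest_outage(samples):
    """Return the longest span, in seconds, between the first failed probe of an
    outage and the next successful probe.

    `samples` is a time-ordered list of (timestamp_seconds, ok) pairs.
    An outage that is still ongoing at the end of the list is not counted.
    """
    longest = 0.0
    outage_start = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            outage_start = timestamp  # outage begins
        elif ok and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None       # outage ends
    return longest


# Synthetic probe results: the endpoint fails at t=2 and recovers at t=5.
samples = [(0, True), (1, True), (2, False), (3, False), (4, False), (5, True), (6, True)]
print(longest_outage(samples))
```

The result is only as precise as the probe interval, so a shorter interval gives a tighter estimate of the failover time.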

It’s possible to prevent the downtime period by enabling vSphere Fault Tolerance (FT) on the HAProxy VM. In this case, a secondary “shadow” VM runs on a separate ESXi host. The VM is constantly replicated from the primary VM over the network. If the primary VM or its host fails, the IP address will instantly float to the secondary VM. In this scenario, no downtime will be observed during the failover process.

Bonus Option: VMware NSX-T

NSX-T is an extremely powerful, fully featured network virtualization platform that includes built-in load balancing. VMware Enterprise PKS leverages it out of the box, but it’s possible to install and configure it for VMware Essential PKS as well. An added benefit of NSX-T load balancers is that they can be deployed in server pools that distribute requests among multiple ESXi hosts. In this scenario, there would be no downtime if an individual host failed.

Conclusion

As shown above, there are multiple load balancing options for deploying a Kubernetes cluster on premises. Here’s a quick summary of each option’s main advantage:

For a quick POC, the simplicity of HAProxy can’t be beat.
For a highly available setup on bare metal, using HAProxy with Keepalived is a reliable option.
On vSphere, you can take advantage of vSphere HA and combine it with a shared datastore to run your load balancing VMs.
If NSX-T is available in your cluster, load balancers can easily be created with a click of a button.
