Afraid your Kubernetes clusters will go down? Now's the time to examine PKS

April 13, 2018 Ahilan Ponnusamy

Lots of enterprises are kicking the tires on Kubernetes. And for good reason! It’s common to hear questions like: How does it work? How can I use it for my organization? What are the right use cases?

Questions quickly get much more tactical from there. There’s the usual curiosity about security and compliance. Operators wonder about patching and upgrades.

Users of the new Pivotal Container Service (PKS) enjoy many of these features out of the box. It “just works” as we like to say.

In this post, we explore one aspect in detail: high availability. Hardware failures are a fact of life. So how do you keep your Kubernetes clusters online, as user traffic ebbs and flows? Let’s take a closer look!

PKS Provides Three Layers of HA for Kubernetes

Long-time Cloud Foundry operators know HA starts with BOSH, the deployment toolchain for distributed systems. In the world of Kubernetes, PKS uses BOSH to monitor and resurrect Kubernetes processes and underlying VMs. PKS provides three levels of HA out of the box:

VM failure management
Kubernetes process failure management
Pod failure management

Let’s take a look at how Kubernetes, on its own, manages the HA for deployed pods, the Kubernetes unit of deployment. The project uses an agent called kubelet that is deployed in the worker nodes.

Source: X-Team

As shown above, the kubelet deployed on each worker node monitors the pods and reports their status to the master node (the Kubernetes cluster control plane). The master re-creates any pods that are unresponsive or destroyed, scheduling each new pod on a healthy node with enough free capacity. However, native Kubernetes HA stops here.

Out of the box, the project does not have the capability to monitor the kubelet agents themselves, or the VMs where the cluster is deployed. PKS - with the help of BOSH - addresses these HA scenarios. After all, most enterprises need to deliver aggressive SLAs at scale!

PKS relies on BOSH, which uses monit to watch your Kubernetes processes and the BOSH agent to track the underlying VMs. If a process or VM becomes unresponsive, PKS restarts the process or creates a new VM.

In a future release, BOSH will also deploy Kubernetes clusters across multiple AZs, adding a fourth layer of HA to your deployment. When you enable this upcoming feature, a single AZ failure won’t bring down your entire Kubernetes cluster.

Now, let’s review how PKS delivers HA for Kubernetes. We’ve seen a lot of interest in running Elasticsearch with PKS, so we’ll use that for our walkthrough.

Installing PKS and Creating a Kubernetes Cluster

Before we get to the HA features specifically, let’s review how fast (and easy) it is to get your Kubernetes cluster up and going.

Download Pivotal Container Service from Pivotal Network.
Log in to Operations Manager, and import PKS.

Open the Harbor Registry tile, and start the PKS configuration.

In the PKS API tab, generate a certificate with a wildcard domain name (*.mydomain.com) that you have registered with a DNS provider such as Google Domains.
In this example, we will use “plan 1” named “small.” Make sure to enable the use of privileged containers. This option gives your pods broader access to the network stack and devices. (Learn more about this option and privileged containers here.)
Double-check that your vSphere configuration is complete and correct under the “Kubernetes Cloud Provider” tab. The datacenter, datastore, and VM folder names should match your vSphere configuration.

In the UAA tab, enter a valid subdomain for the UAA URL field (pks.mydomain.com).

After completing the configuration, go back to the Installation Dashboard. Click “Apply changes.” It can take up to 30 minutes for PKS to install. Upon successful installation, you’ll notice the PKS tile has changed from orange to green.

Map the PKS IP address (available under the “Status” tab) to the UAA subdomain you entered in the UAA tab (pks.mydomain.com) with your DNS provider.

Download the pks and kubectl CLIs from Pivotal Network. If you’re on a Windows machine, rename the executables to pks.exe and kubectl.exe and add them to the PATH environment variable.
Let’s now create a PKS admin user in UAA:
- SSH into OpsMgr. (You can also install cf-uaac on your local machine.)

$ ssh ubuntu@<<OpsMgr Fully Qualified Domain Name / IP>>

Target uaac to the environment, and create the admin user.

$ uaac target https://pks.mydomain.com:8443 --skip-ssl-validation

$ uaac token client get admin -s <<password>>

(The password is available in the Ops Mgr PKS tile -> Credentials tab -> Uaa Admin Secret.)

$ uaac user add john --emails john@mydomain.com -p <<password>>

$ uaac member add pks.clusters.admin john

Now, we’re ready to create a Kubernetes cluster! Typically, the PKS cluster is created by the operations team, before handing the environment over to the development team. In our example, John (as the administrator) will create a PKS Cluster called es_cluster (Elasticsearch cluster). The cluster will have 2 nodes (VMs) based on the sizing defined in plan 1, our small plan from earlier. We use these commands:

$ pks login -a pks.mydomain.com -u john -p <<password>> --skip-ssl-verification

$ pks create-cluster es_cluster --external-hostname es_cluster.mydomain.com --plan small --num-nodes 2

You can see the cluster creation process is “In Progress”.

Let’s check the cluster status:

$ pks cluster es_cluster

After a few minutes, we notice that the cluster creation has succeeded, and a Kubernetes master IP is assigned to the cluster.
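If you would rather script the wait than re-run the command by hand, a small helper can pull the state out of the status output. This is a minimal sketch: it assumes the output of pks cluster contains a line of the form "Last Action State: <state>", and the function name cluster_state and the sample text are illustrative only.

```shell
# Print the value of the "Last Action State:" line read from stdin.
# Assumption: `pks cluster <name>` prints such a line; check your CLI version.
cluster_state() {
  awk -F': *' '$1 == "Last Action State" { print $2; exit }'
}

# Hypothetical sample output, used only to demonstrate the helper.
sample='Name:               es_cluster
Last Action:        CREATE
Last Action State:  succeeded'

printf '%s\n' "$sample" | cluster_state   # prints succeeded
```

Against a live environment you would pipe the real command instead, for example pks cluster es_cluster | cluster_state, and repeat until it reports the state you expect.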

Map the Kubernetes master IP to es_cluster subdomain in your DNS provider (es_cluster.mydomain.com).

The new PKS cluster es_cluster is now ready for use. John, the administrator, will now send the config file with the cluster configuration and authentication information to the engineering teams. He does this with the get-credentials command. He then sends the config file created under the $user_home/.kube folder.

$ pks get-credentials es_cluster

Deploying the Elasticsearch Pod

Let’s now step through the developer interaction with PKS using the kubectl CLI. We can open a new terminal, and get the details of the two worker nodes the admin created. (In the real world, developers will place the config file under the $user_home/.kube folder.)

$ kubectl get nodes -o wide

Download the Elasticsearch project from GitHub.

We can create a StorageClass, a PersistentVolumeClaim, and a NodePort service for the Elasticsearch pod.

$ kubectl create -f storage-class-vsphere.yml

$ kubectl create -f persistent-volume-claim-es.yml

$ kubectl create -f es-svc.yml
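The manifest contents live in the GitHub project and are not reproduced in this post. For orientation only, here is a rough sketch of what a vSphere-backed StorageClass and a matching PersistentVolumeClaim can look like. The resource names es-storage and es-pvc, the 2Gi size, and the thin disk format are assumptions, not the project's actual values.

```shell
# Print an illustrative StorageClass and PersistentVolumeClaim to stdout.
# All names and sizes below are placeholders, not the project's real values.
print_es_storage_manifests() {
  cat <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: es-storage
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: thin
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: es-pvc
spec:
  storageClassName: es-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
EOF
}

print_es_storage_manifests
```

You could review the printed YAML and then pipe it to kubectl create -f - (kubectl reads the manifest from standard input when the file name is a single dash).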

NOTE: Instead of using NodePort, you may also use port forwarding to map a local port (e.g. 2000) to ES port 9200. This way, you can restrict pod access within the Kubernetes cluster. You can find more info about port forwarding here.

Deploy the Elasticsearch pod:

$ kubectl create -f es-deployment.yml

Get all information about the Elasticsearch deployment with the following commands. (You can find the IP address of the Worker Node, the one that hosts the Elasticsearch pod, from the Node column in get pods call output.)

$ kubectl get pods -o wide

$ kubectl get nodes -o wide

$ kubectl get svc

All requests to create and access data from Elasticsearch are available as a Postman collection in the GitHub project you downloaded earlier. Import the JSON file (ES_Requests.postman_collection.json) into Postman. (Download Postman. Prefer curl? Check out this repo.)

Open the CreateIndex request. Update the IP address and the port number of the request URL to match our environment. We need two pieces of information:
- IP address: the IP of the worker node where the pod is deployed.
- Port: the NodePort, the five-digit port in the get svc command output.
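To make the port lookup concrete, the helper below pulls the NodePort out of a PORT(S) value as printed by kubectl get svc and builds the base URL for the requests. It is a sketch under assumptions: the function names, the sample IP address 10.0.0.11, the sample port value, and the use of plain http are all hypothetical.

```shell
# Extract the NodePort from a PORT(S) value such as "9200:31234/TCP".
node_port_from_ports() {
  ports=${1#*:}                  # drop the service port and the colon
  printf '%s\n' "${ports%%/*}"   # drop the protocol suffix
}

# Build the base URL for Elasticsearch requests from a node IP and a NodePort.
es_base_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Hypothetical values; substitute the ones from your own cluster.
port=$(node_port_from_ports '9200:31234/TCP')
es_base_url '10.0.0.11' "$port"   # prints http://10.0.0.11:31234
```

The resulting URL is what you would paste into the Postman request, or pass to curl if you prefer the command line.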

Execute the REST call. Click the Send button, and the index will be created.

Similarly, open the CreateOrderType request and execute it after updating the URL to reflect your environment.

Create two customer records by executing the CreateCustomer and CreateCustomer2 requests.
Get a customer record to validate the data by executing the GetCustomer1 request.

Now we have our cluster, and Elasticsearch installed. Let’s now review the HA features inherent to PKS!

Now, the Good Stuff: High Availability in PKS

When customers ask about HA Kubernetes, they want to know how the system stays online even when underlying resources fail. So let’s focus on the following scenarios in PKS:

Pod failure management (done by Kubernetes)
VM failure management (done by BOSH)
Pod and VM failure management (done by Kubernetes and BOSH)

Pod Failure Management by Kubernetes

Just for fun, let’s delete an Elasticsearch pod, and see how Kubernetes recreates the pod in the second VM.

Get pod information:

$ kubectl get pods -o wide

Delete the pod:

$ kubectl delete pod <<pod name>>

Let’s watch Kubernetes recreate the pod in the second VM. You’ll notice the pod being terminated from the first VM. A new pod is created in the next VM.

$ kubectl get pods -o wide

Execute get pods after a few minutes to confirm the successful creation of the pod. You can also use the watch flag (-w) to monitor the creation process without having to re-execute the command.
Execute the GetCustomer1 request in Postman to validate that the pod is deployed successfully and the data is persisted.
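If you want a scripted check rather than watching the output yourself, a helper along these lines can tell whether every listed pod has reached the Running state. It is only a sketch: it assumes STATUS is the third column of the kubectl get pods output, and the function name and sample rows are made up for illustration.

```shell
# Succeed only if stdin (the output of `kubectl get pods`) lists at least
# one pod and every pod has STATUS "Running" (assumed to be column 3).
all_pods_running() {
  awk 'NR > 1 { seen = 1; if ($3 != "Running") bad = 1 }
       END { exit ((seen && !bad) ? 0 : 1) }'
}

# Hypothetical sample: the old pod is terminating while its replacement starts.
sample='NAME       READY  STATUS             RESTARTS  AGE
es-old-1   1/1    Terminating        0         1h
es-new-2   0/1    ContainerCreating  0         5s'

if printf '%s\n' "$sample" | all_pods_running; then
  echo "all pods running"
else
  echo "still waiting"
fi
```

With a live cluster you would feed it kubectl get pods -o wide instead of the sample text, and repeat the check until it succeeds.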

OK, on to our second scenario!

VM Failure Management by BOSH

Now, let’s shut down a VM from the PKS cluster, and watch BOSH automatically create a new one to replace it.

Find the non-pod resident VM (i.e. the VM that does not have the pod deployed). We type these commands:

$ kubectl get pods -o wide

$ kubectl get nodes -o wide
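The comparison between the two outputs can also be scripted. The sketch below prints every node that hosts no pod, assuming NODE is the seventh column of kubectl get pods -o wide and NAME is the first column of kubectl get nodes. The function name and the sample rows are hypothetical.

```shell
# $1: output of `kubectl get pods -o wide`, $2: output of `kubectl get nodes`.
# Prints the names of nodes that do not appear in the NODE column of the pods.
nodes_without_pods() {
  printf '%s\n---\n%s\n' "$1" "$2" | awk '
    $0 == "---"       { second = 1; header = 1; next }  # switch to node list
    !second && NR > 1 { busy[$7] = 1; next }            # remember pod hosts
    second && header  { header = 0; next }              # skip node header row
    second && !($1 in busy) { print $1 }'
}

# Hypothetical sample outputs for a two-node cluster.
pods='NAME    READY  STATUS   RESTARTS  AGE  IP          NODE
es-1    1/1    Running  0         5m   10.200.1.4  node-a'
nodes='NAME    STATUS  ROLES   AGE  VERSION
node-a  Ready   <none>  1h   v1.9.2
node-b  Ready   <none>  1h   v1.9.2'

nodes_without_pods "$pods" "$nodes"   # prints node-b
```

In practice you would capture the two kubectl outputs into variables with command substitution and pass them in the same order.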

Shut down the VM in the vSphere web client. Click on the Red Square in the toolbar, or select the Power Off option from the right-click menu.

Now, watch BOSH create the replacement VM. BOSH will make sure the desired state of the Kubernetes cluster (2 worker nodes) is met. We simply execute the command:

$ kubectl get nodes -o wide -w
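To confirm that the desired state is restored without watching the stream, you can count the nodes that report Ready. This is a rough sketch, assuming STATUS is the second column of the kubectl get nodes output. The function name and sample rows are illustrative.

```shell
# Count nodes whose STATUS (assumed column 2) is exactly "Ready".
ready_node_count() {
  awk 'NR > 1 && $2 == "Ready" { n++ } END { print n + 0 }'
}

# Hypothetical sample taken while the replacement VM is still joining.
sample='NAME    STATUS    ROLES   AGE  VERSION
node-a  Ready     <none>  1h   v1.9.2
node-c  NotReady  <none>  20s  v1.9.2'

printf '%s\n' "$sample" | ready_node_count   # prints 1
```

With a live cluster you would run kubectl get nodes | ready_node_count and repeat it until the count returns to 2, the number of worker nodes requested at cluster creation.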

On to scenario 3!

Pod and VM Failure Management by Kubernetes and BOSH

Let’s shut down the VM where the pod is deployed. Here, both Kubernetes and BOSH fix the failure. In this scenario:

Kubernetes will create the pod in the second worker node
BOSH will create a new VM to replace the shut-down VM

1. Find the VM where the pod is deployed:

$ kubectl get pods -o wide

$ kubectl get nodes -o wide

2. Log in to the vSphere web client, and find the VM using the IP address in the search box:

3. Shut down the VM. Click on the Red Square in the toolbar.

4. Execute the following commands in two different terminals. Watch BOSH recreate the new VM, and Kubernetes deploy the pod in the second VM:

$ kubectl get nodes -o wide -w

$ kubectl get pods -o wide -w

Apart from these three HA scenarios, PKS will also monitor Kubernetes processes like kubelet, and bring up the failed process as needed. PKS also communicates the failure to the admin through the configured notification channel (email, pager, and so on). You can use the bosh CLI with the -e flag (for example, bosh -e <<environment>> instances --ps) to check which processes are monitored on the master and worker nodes.

High Availability: Just One of Many Day 2 Features PKS Delivers Out of the Box

Customers turn to Pivotal to deliver availability for their most important systems. We’ve just demonstrated how PKS keeps your Kubernetes clusters online even when the underlying resources fail. This same innovation keeps you online during patching, upgrades, and when performing blue-green deployments. Want to try PKS? Download it, and then check out the documentation!

About the Author

Ahilan Ponnusamy is a Senior Platform Architect at Pivotal. He works on the Partner Solution Architecture team supporting DellTech and VMware sales teams. Prior to joining Pivotal, Ahilan was with Oracle Technologies, leading the SMB Cloud Platform Specialists team supporting the North America sales team.
