When architecting the Kubernetes platform for your enterprise, two of the first fundamental questions you need to address are “How many clusters will I need?” and “Do I need a few big clusters, or many smaller ones?”
The need for a distributed multi-cluster architecture
Kubernetes itself offers a multi-tenancy model that uses namespaces to separate tenants and workloads. It is not uncommon for enterprises to run just a few big clusters and use namespaces to isolate workloads by purpose; for example, for dev, test, or production, or for different lines of business. This model was especially popular in the early days, when Kubernetes cluster creation was a big undertaking for many enterprises.
With the fast growth of Kubernetes technology and adoption, cluster creation is much simpler today. You can easily spin up new clusters via Kubernetes services offered by major public cloud providers. Some commercial Kubernetes distributions, such as VMware Tanzu Kubernetes Grid, also enable the easy creation of multiple clusters in a single deployment, either on premises or in clouds.
Today, there is a growing desire for a distributed multi-cluster Kubernetes architecture with smaller, more dedicated clusters—per application, per environment, per team, or per SLA—due to some of the obvious benefits this architecture can bring:
- Better isolation. Applications deployed in namespaces on one cluster share the same hardware, network, and operating system, as well as certain cluster-wide services such as the API server, controller manager, scheduler, and DNS. This soft multi-tenancy model raises concerns about potential security and performance issues. Using clusters as the isolation boundary is therefore the preferred way to provide hard multi-tenancy, relying on the underlying hypervisor to isolate workloads much more effectively.
- Higher availability. Another benefit that multi-cluster architecture brings is a reduced blast radius. Issues in one cluster, especially those related to shared services, won’t bring down applications running on other clusters, so you gain higher availability for your applications overall.
- Customized configurations. Different applications require different configurations; for example, some apps may need GPU worker nodes or a certain CNI plugin, or prefer a specific public or private cloud as the underlying IaaS. With clusters as the isolation boundary for applications, you can equip Kubernetes with the exact configuration that the apps need. You can also control the lifecycle of each cluster; for example, you won’t have to force all your applications to run on a newer version of Kubernetes if some of them aren’t ready yet.
- Edge deployments. Kubernetes deployment at the edge has emerged as a unique use case in recent years. Such a deployment model naturally requires a distributed multi-cluster architecture with centralized management.
Operational overhead calls for more efficient multi-cluster management
One big challenge of this architecture is the operational overhead, especially as the number of clusters starts to grow. Certain administrative tasks, such as access control and cluster upgrading, will often need to be conducted manually and repetitively on each and every cluster. For example, just to assign one simple role binding, cluster operators may have to open multiple cluster connections; they’ll then have to go back and redo everything when access needs to be updated. To make the management of Kubernetes more effective, automation is needed.
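To make that repetition concrete, here is a minimal Python sketch of the manual approach. The cluster names and the `apply_role_binding` helper are illustrative stand-ins, not a real kubectl or Tanzu Mission Control API:

```python
# Sketch: applying one RoleBinding by hand to every cluster.
# "clusters" and "apply_role_binding" are hypothetical stand-ins.
clusters = ["dev-us-east", "test-us-east", "prod-us-east", "prod-eu-west"]

role_binding = {
    "kind": "RoleBinding",
    "metadata": {"name": "app-readers", "namespace": "payments"},
    "roleRef": {"kind": "ClusterRole", "name": "view"},
    "subjects": [{"kind": "Group", "name": "payments-devs"}],
}

def apply_role_binding(cluster: str, binding: dict) -> str:
    """Stand-in for opening a connection and applying the manifest."""
    return f"{cluster}: applied {binding['kind']}/{binding['metadata']['name']}"

# Without central management, every cluster needs its own pass,
# and every access change means repeating the whole loop.
results = [apply_role_binding(c, role_binding) for c in clusters]
```

The loop itself is trivial; the operational cost is that it must be rerun, against every cluster, for every change, which is exactly the overhead a centralized platform removes.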
This is where Tanzu Mission Control comes in. Tanzu Mission Control is a centralized Kubernetes management platform for operating and securing Kubernetes clusters and applications across teams and clouds, be they public or private. Organizations can centralize all of their Kubernetes clusters using Tanzu Mission Control, either by provisioning clusters via the platform directly or by attaching existing clusters to it across any environment. Its unique design offers an automated way to manage all clusters effectively and efficiently.
Tanzu Mission Control helps tackle multi-cluster management challenges
Now let’s dig a bit deeper into some of the key capabilities and features of Tanzu Mission Control that support the efficient management of multiple Kubernetes clusters deployed in a distributed fashion. They include automated cluster lifecycle management, a unique resource hierarchy, a powerful policy engine, and a centralized administrative interface.
Automated Kubernetes lifecycle management
Cluster lifecycle management is a critical task for operations teams; the ability to automate tasks such as provisioning, upgrading, scaling, and deleting clusters through a centralized platform significantly increases operational efficiency. In recent years, the Kubernetes community has started to rally behind an open source project—Cluster API—as the specialized toolset to bring declarative, Kubernetes-style APIs to cluster creation, configuration, and management in the Kubernetes ecosystem. (Check out this blog post to learn more about Cluster API.)
Tanzu Mission Control leverages Cluster API to automate the lifecycle management of clusters provisioned via the platform (it currently supports AWS EC2), so operators can easily control the lifecycle of their clusters through its intuitive UI, API, and CLI. Operators can also choose when to upgrade each cluster and which Kubernetes version to upgrade it to, meeting the specific needs of any apps or teams.
Provisioning new clusters via Tanzu Mission Control UI
Choose the Kubernetes version to upgrade to during the upgrade process
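The declarative, Kubernetes-style model behind Cluster API can be sketched as follows; the dicts and the `reconcile` helper are illustrative stand-ins, not real Cluster API objects:

```python
# Sketch of the declarative model: you edit the desired spec, and a
# controller reconciles the actual state toward it. Hypothetical data.
desired = {"name": "team-a-prod", "version": "v1.27.3", "workers": 3}
actual  = {"name": "team-a-prod", "version": "v1.26.6", "workers": 3}

def reconcile(desired: dict, actual: dict) -> list[str]:
    """One reconcile pass: return the actions needed to converge."""
    actions = []
    if actual["version"] != desired["version"]:
        actions.append(f"upgrade control plane to {desired['version']}")
    if actual["workers"] != desired["workers"]:
        actions.append(f"scale workers to {desired['workers']}")
    return actions

# An upgrade is just a spec change; no imperative per-node scripting.
actions = reconcile(desired, actual)
```

This is why a platform built on Cluster API can expose upgrades as a single version selection in the UI: the change is a spec edit, and controllers do the rest.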
Unique resource hierarchy
Designed with the goal of offering a true multi-cluster management platform to both IT operation teams and application operation teams, Tanzu Mission Control adopts a resource hierarchy that reflects the needs of both. At the top level is the “organization,” into which customers get mapped. Within an organization, you can create multiple “cluster groups” to group various clusters together; for example, by teams or by environments (dev, test, or production). There are also “workspaces” with which you can group multiple namespaces together across clusters.
Tanzu Mission Control resource hierarchy
Cluster groups and workspaces logically separate the concerns of the infrastructure and application teams, which enables easier handoffs and transitions between them and avoids a ticket-based approach. IT and platform operations teams can manage cluster groups to implement cluster-wide configurations and define certain ground rules for things such as company-wide security and compliance, while application operations teams can work within workspaces to gain more granular control over their own applications.
The resource hierarchy separates IT and app teams’ concerns
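The two-sided hierarchy can be sketched in a few lines of Python: cluster groups hold clusters (the infrastructure view), while workspaces hold namespaces that may span clusters (the application view). All names below are illustrative:

```python
# Sketch of the resource hierarchy; names are hypothetical.
org = {
    "cluster_groups": {
        "prod": ["cluster-a", "cluster-b"],
        "dev":  ["cluster-c"],
    },
    "workspaces": {
        # namespaces referenced as (cluster, namespace) pairs,
        # so a workspace can span clusters
        "payments-app": [("cluster-a", "payments"), ("cluster-c", "payments")],
    },
}

def clusters_in_group(org: dict, group: str) -> list[str]:
    """Infrastructure view: the clusters an IT team manages together."""
    return org["cluster_groups"][group]

def namespaces_in_workspace(org: dict, workspace: str) -> list[tuple[str, str]]:
    """Application view: the namespaces an app team manages together."""
    return org["workspaces"][workspace]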
Powerful policy engine
The resource hierarchy drives another important module of Tanzu Mission Control: the policy engine. Together, the resource hierarchy and the policy engine make it much easier for you to manage your clusters consistently and efficiently at scale.
By leveraging the policy engine, IT and platform operations teams can apply policies such as access directly to a cluster group to automatically implement them on all the clusters in that group. And when a new cluster gets created or added into the group, it will automatically inherit all the policies the group already carries. For example, if you create a new cluster in the production cluster group, you won’t have to manually apply any of the policies to it; instead, it will immediately inherit the policies from the cluster groups it belongs to.
The same goes for workspaces. By applying app-level policies like container registry and network to a workspace directly, application operations teams will be implementing those policies on all the namespaces in that workspace. New namespaces will also inherit the existing policies of the workspaces they belong to. You will even be able to set policies to the entire organization to implement them on all the resources in that organization.
The policy engine also allows for the addition of direct policies. Depending on the permissions a user has, they can make the policy more or less restrictive. This feature helps enable the flexibility and freedom of application teams while making sure any high-level IT rules remain in place for critical aspects of the operations, such as security- and compliance-related operations.
Apply policies to your clusters and workspaces at scale
Centralized administrative interface
Another aspect of Tanzu Mission Control that helps increase management efficiency is centralization. Having all clusters across teams and environments centralized in a single view enables greater observability than ever before, which leads to much faster monitoring, troubleshooting, and operational control.
View all your clusters in Tanzu Mission Control at a glance
Operators can quickly check the clusters’ core data—including resource utilization levels of memory and CPU usage, Kubernetes versions of the cluster, public cloud regions where the cluster resides, as well as the number of nodes, pods, and namespaces of any particular clusters—via the UI. The interface also visualizes the health of the clusters and their various components so that any issues can be easily identified. Its unique resource hierarchy and flexible label system also make it much easier for operators to sort and locate the right clusters to operate upon.
Check the metadata and health of your cluster and its components
As we have seen, a distributed multi-cluster architecture is becoming the dominant architecture of the Kubernetes platform for many enterprises. But in order to reap the benefits of such an architecture without drowning their operations teams, enterprises need a solution to help automate multi-cluster management with more efficiency at scale.
VMware Tanzu Mission Control, with its automated cluster lifecycle management, unique resource hierarchy, powerful policy engine, and centralized administrative interface for observability and operations, can help enterprises successfully tackle the challenge of multi-cluster management to provide a more robust Kubernetes platform with higher availability and security for all applications, across any clouds.
About the AuthorMore Content by Ning Ge