Enhancing Kubernetes Security with OPA

February 3, 2020 Jamie Duncan

The security ecosystem for Kubernetes can be confusing. A Sysdig article from July 2019 outlined 33 security tools for Kubernetes. That number has only grown. The tools that help secure your Kubernetes cluster today can be sorted into three broad categories.

Tools like Clair and SonarQube scan code inside your container image for vulnerabilities. They report back their findings to help make your code more secure. Platforms like StackRox, Dynatrace, and Sysdig focus on securing your pipelines to ensure that the code you verified is what gets deployed to your environment. Finally, low-level tools like SELinux, AppArmor, and POSIX are leveraged by Kubernetes and your container runtime to prevent bad actors from getting a foothold inside your cluster.

Even with all of these products working together, there’s a hole in this security model, and it’s the subject of this blog post: security issues still arise if the wrong sort of workload is deployed into a sensitive area. For example, you wouldn’t want to allow additional load balancers to be deployed in your Kubernetes cluster, because they could route traffic in unintended or unsafe ways. You also wouldn’t want unverified development code deployed into your production environment. To prevent these deployment-related security issues, you can create policies using a Kubernetes component called an Admission Controller.

Kubernetes Admission Controllers can analyze an API request to create objects in a cluster before they’re actually created. There are two kinds of Admission Controllers.

Mutating Admission Controllers take an incoming API request and make a prescribed change to it before deploying it in your Kubernetes cluster. These can be useful if you want to make universal changes to parts of a request. Common actions like setting default values for quotas, default Storage Classes, or even setting a Pod to always pull a new copy of the image are handled by Mutating Admission Controllers. A list of Mutating Admission Controllers that are enabled by default can be found in the Kubernetes documentation.
Validating Admission Controllers don’t make changes to the API request, but they can reject a request if it violates a policy enforced by the Admission Controller.

If either type of Admission Controller rejects an API request, the objects are never actually deployed, and the request reports back that it failed at that point in the workflow. Today we’re investigating a Validating Admission Controller that’s quickly gaining popularity in the Kubernetes community, based on Open Policy Agent (OPA).
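The flow described above can be sketched in a few lines of Python. This is a toy model, not the Kubernetes API server's implementation: the function names (`set_image_pull_policy`, `reject_load_balancers`, `admit`) and the simplified request dictionaries are assumptions made for illustration. It shows mutating controllers running first and changing the request, then validating controllers either accepting or rejecting the result.

```python
import copy

def set_image_pull_policy(obj):
    """Mutating step (hypothetical): force every container to pull its image."""
    obj = copy.deepcopy(obj)
    for container in obj.get("spec", {}).get("containers", []):
        container["imagePullPolicy"] = "Always"
    return obj

def reject_load_balancers(obj):
    """Validating step (hypothetical): return a reason to reject, or None."""
    spec = obj.get("spec", {})
    if obj.get("kind") == "Service" and spec.get("type") == "LoadBalancer":
        return "LoadBalancer Services are not allowed"
    return None

def admit(obj, mutators, validators):
    """Run mutators first, then validators; the first rejection stops the request."""
    for mutate in mutators:
        obj = mutate(obj)
    for validate in validators:
        reason = validate(obj)
        if reason is not None:
            # A rejected request never results in a stored object.
            return {"allowed": False, "reason": reason, "object": None}
    return {"allowed": True, "reason": None, "object": obj}

pod = {"kind": "Pod", "spec": {"containers": [{"name": "web", "image": "nginx"}]}}
service = {"kind": "Service", "spec": {"type": "LoadBalancer"}}

pod_result = admit(pod, [set_image_pull_policy], [reject_load_balancers])
service_result = admit(service, [set_image_pull_policy], [reject_load_balancers])
print(pod_result["allowed"], service_result["allowed"])
```

In this sketch the Pod is admitted with its pull policy rewritten, while the LoadBalancer Service is rejected with a reason, which mirrors the two outcomes an API request can have.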

Using Open Policy Agent

OPA (pronounced “oh-pa”) is an incubating project of the Cloud Native Computing Foundation (CNCF). From its documentation website, OPA is

“An open source, general-purpose policy engine that unifies policy enforcement across the stack. OPA provides a high-level declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from your software. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more.”

OPA is deployed in Kubernetes as a Validating Admission Controller. There’s a great tutorial on the OPA website to help you get it up and running in your cluster. Although the tutorial uses Minikube as the development platform, any functional Kubernetes cluster can be used. The tutorial also uses self-signed TLS certificates for communication between OPA and Kubernetes. If you want to use your own certificates, supply them in place of the files the tutorial generates with OpenSSL.

Validating Requests with Rego

When a request comes into the API server, OPA validates it against a rule set written in Rego, a declarative query language designed to work with structured documents such as JSON. Rego is inspired by Datalog, a query language that has been in use for decades. The OPA tutorial page walks you through setting up OPA and configuring it to allow specific Ingress domains for specific namespaces. The policy created in the tutorial ensures that traffic bound for one domain can’t be hijacked by creating another Ingress for the same domain that points to a different service. Without a tool like OPA, that could be a possible attack vector.

Policies written using Rego are how you’ll interact with OPA in your Kubernetes cluster. In the following sections, we’ll examine an OPA policy in depth. The code for this example comes from the OPA website.

Investigating the Intersection of Rego and OPA

In the tutorial, policies are loaded into OPA from a ConfigMap. The complete policy is shown here:

package kubernetes.admission

import data.kubernetes.namespaces

operations = {"CREATE", "UPDATE"}

deny[msg] {
	input.request.kind.kind == "Ingress"
	operations[input.request.operation]
	host := input.request.object.spec.rules[_].host
	not fqdn_matches_any(host, valid_ingress_hosts)
	msg := sprintf("invalid ingress host %q", [host])
}

valid_ingress_hosts = {host |
	whitelist := namespaces[input.request.namespace].metadata.annotations["ingress-whitelist"]
	hosts := split(whitelist, ",")
	host := hosts[_]
}

fqdn_matches_any(str, patterns) {
	fqdn_matches(str, patterns[_])
}

fqdn_matches(str, pattern) {
	pattern_parts := split(pattern, ".")
	pattern_parts[0] == "*"
	str_parts := split(str, ".")
	n_pattern_parts := count(pattern_parts)
	n_str_parts := count(str_parts)
	suffix := trim(pattern, "*.")
	endswith(str, suffix)
}

fqdn_matches(str, pattern) {
	not contains(pattern, "*")
	str == pattern
}

The first line, package kubernetes.admission, defines a hierarchical name for the policies in the rest of the file. In the tutorial’s setup, admission policies are expected under kubernetes.admission.
The import statement, import data.kubernetes.namespaces, gives the policy access to the namespaces that currently exist in the Kubernetes cluster, under the short name namespaces. In the tutorial’s deployment, this data is loaded into OPA from the Kubernetes API and kept up to date as namespaces change.
operations = {"CREATE", "UPDATE"} defines the set of operations the policy applies to. In this case, the policy is evaluated when an API object is created or updated.

After this, OPA policies written with Rego can become a little counterintuitive until you’re accustomed to how they function.

Dissecting an OPA Policy Written with Rego

The most common pattern in Rego is to define a set of conditions to test. If the conditions are all met, the request is denied and the reason is passed back through the Kubernetes API server to the user who requested the action. These conditions are defined in the deny rule.

deny[msg] {
	input.request.kind.kind == "Ingress"
	operations[input.request.operation]
	host := input.request.object.spec.rules[_].host
	not fqdn_matches_any(host, valid_ingress_hosts)
	msg := sprintf("invalid ingress host %q", [host])
}

Let’s look at these conditions.

input.request.kind.kind == "Ingress" tells OPA to act only on API requests for Ingress objects.
operations[input.request.operation] confirms that the request’s operation is in the operations set. It is true if the operation is either CREATE or UPDATE. Combined with the previous test, this means the policy acts only on Ingress objects when they’re created or updated.

host := input.request.object.spec.rules[_].host assigns the variable host from the host values under .spec.rules in the API request. The _ character is a special anonymous variable. Instead of having to name an index variable explicitly, you can use _ to iterate through a list of values. Here, _ iterates through all the rules in the request’s .spec.rules, and each host value is tested against the remaining policy conditions.
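A Python sketch can make these first three conditions concrete. It is a model of the logic, not Rego and not OPA's implementation. The dictionary below is a trimmed-down stand-in for the admission request that OPA receives as input; the namespace and the two hostnames are made up for illustration, and `candidate_hosts` is a hypothetical helper name. The list comprehension plays the role of the _ variable.

```python
# Trimmed-down stand-in for the request OPA sees as `input` (values are illustrative).
admission_input = {
    "request": {
        "kind": {"kind": "Ingress"},
        "operation": "CREATE",
        "namespace": "qa",
        "object": {
            "spec": {
                "rules": [
                    {"host": "signin.qa.example.com"},
                    {"host": "signin.prod.example.com"},
                ]
            }
        },
    }
}

OPERATIONS = {"CREATE", "UPDATE"}

def candidate_hosts(inp):
    """Return every host that the first three conditions of the rule would bind."""
    request = inp["request"]
    if request["kind"]["kind"] != "Ingress":
        return []
    if request["operation"] not in OPERATIONS:
        return []
    # Rego's `rules[_].host` tries each rule in turn; rules without a host are skipped.
    rules = request["object"]["spec"]["rules"]
    return [rule["host"] for rule in rules if "host" in rule]

hosts = candidate_hosts(admission_input)
print(hosts)
```

Each host returned here is then checked independently by the remaining conditions, so a single Ingress with two disallowed hosts would produce two denial messages.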

Clarifying the Final Test

The final test is where Rego can get a little confusing. The default method, and the one used in this tutorial, is to create a policy that will deny an API request. That means all of the conditions inside the deny policy we’ve been walking through have to evaluate as true. If all of the conditions inside the deny policy are met, then the request is denied by OPA.

The final test relies on fqdn_matches_any, which is true if the host in the API request matches a pattern in the ingress-whitelist annotation for its namespace. But we want to deny requests whose hosts aren’t covered by that annotation. To accomplish this, the final test uses the not operator. Even though fqdn_matches_any is true when the host should be allowed, the not operator makes the condition succeed only when fqdn_matches_any does not.

The code not fqdn_matches_any(host, valid_ingress_hosts) calls the fqdn_matches_any function defined in the policy, passing it the host being tested and valid_ingress_hosts, which is also defined in the policy.
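The inversion can be modeled in Python as a set of messages that only collects hosts failing the allow check. This is a sketch under assumptions: `is_allowed` is a hypothetical placeholder for fqdn_matches_any(host, valid_ingress_hosts), and `under_qa` is a made-up check used only to exercise it.

```python
def deny(hosts, is_allowed):
    """Collect one message per host that fails the allow check.

    `is_allowed` stands in for fqdn_matches_any(host, valid_ingress_hosts).
    """
    return {f'invalid ingress host "{host}"' for host in hosts if not is_allowed(host)}

# Hypothetical allow check: only hosts under qa.example.com pass.
def under_qa(host):
    return host.endswith(".qa.example.com")

messages = deny(["signin.qa.example.com", "signin.prod.example.com"], under_qa)
print(messages)
```

An empty set means nothing objected to the request, so it is allowed; any message in the set means the request is denied.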

The value for valid_ingress_hosts is defined as follows:

valid_ingress_hosts = {host |
	whitelist := namespaces[input.request.namespace].metadata.annotations["ingress-whitelist"]
	hosts := split(whitelist, ",")
	host := hosts[_]
}

The curly brackets, together with the | character, define valid_ingress_hosts as a set comprehension: the set of every host value for which the conditions after the | hold. whitelist and hosts are local variables used to build that set. The whitelist value is read from the annotations on the namespace targeted by the incoming API request. If the namespace has an annotation named ingress-whitelist, its value is split on commas and each resulting hostname pattern becomes a member of the set.

In the tutorial, namespaces with an ingress-whitelist annotation are created to test the policy against.

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    ingress-whitelist: "*.qa.example.com,*.internal.example.com"
  name: qa

In this namespace, valid_ingress_hosts would be calculated as follows:

{"*.qa.example.com", "*.internal.example.com"}
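The comprehension maps closely onto a Python set comprehension, which may be a more familiar notation. The sketch below is a model rather than Rego: the annotation string is copied from the tutorial's qa namespace, while the `scratch` namespace and the dictionary layout keyed by namespace name are assumptions for illustration.

```python
# The qa annotation comes from the tutorial; `scratch` is a hypothetical
# namespace without the annotation.
namespaces = {
    "qa": {
        "metadata": {
            "name": "qa",
            "annotations": {
                "ingress-whitelist": "*.qa.example.com,*.internal.example.com"
            },
        }
    },
    "scratch": {"metadata": {"name": "scratch"}},
}

def valid_ingress_hosts(namespace):
    """Python analogue of the Rego set comprehension."""
    metadata = namespaces.get(namespace, {}).get("metadata", {})
    whitelist = metadata.get("annotations", {}).get("ingress-whitelist")
    if whitelist is None:
        # In Rego the comprehension body is never satisfied, so the set is empty.
        return set()
    return {host for host in whitelist.split(",")}

print(sorted(valid_ingress_hosts("qa")))
print(valid_ingress_hosts("scratch"))
```

Under this reading, a namespace without the annotation ends up with an empty set of valid hosts, so no host can match and every Ingress created or updated in that namespace is denied.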

This set is passed into fqdn_matches_any as the patterns argument.

fqdn_matches_any(str, patterns) {
	fqdn_matches(str, patterns[_])
}

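fqdn_matches_any succeeds if fqdn_matches is true for at least one pattern in the set. fqdn_matches is defined twice; when a function has more than one definition, Rego treats a call as true if any of the definitions succeeds.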

fqdn_matches(str, pattern) {
	pattern_parts := split(pattern, ".")
	pattern_parts[0] == "*"
	str_parts := split(str, ".")
	n_pattern_parts := count(pattern_parts)
	n_str_parts := count(str_parts)
	suffix := trim(pattern, "*.")
	endswith(str, suffix)
}

fqdn_matches(str, pattern) {
	not contains(pattern, "*")
	str == pattern
}

Together, these two definitions test a host against a pattern like *.qa.example.com:

The first definition applies when the pattern starts with *. It trims the *. from the front of the pattern and is true if the host ends with what remains.
The second definition applies when the pattern contains no *. It is true if the host string matches the pattern exactly.
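The two definitions can be modeled in Python as a single function whose branches are joined with or. This is a sketch of the logic, not OPA's implementation. It assumes that Rego's trim, like Python's `str.strip`, removes any leading or trailing characters found in the given set, and it leaves out the unused length variables from the first definition.

```python
def fqdn_matches(host, pattern):
    """Python analogue of the two Rego `fqdn_matches` definitions, joined by `or`."""
    # First definition: wildcard patterns such as "*.qa.example.com".
    wildcard = False
    if pattern.split(".")[0] == "*":
        # Strip leading/trailing "*" and "." characters, leaving "qa.example.com".
        suffix = pattern.strip("*.")
        wildcard = host.endswith(suffix)
    # Second definition: patterns without "*" must match exactly.
    exact = "*" not in pattern and host == pattern
    return wildcard or exact

def fqdn_matches_any(host, patterns):
    """True if the host matches at least one pattern in the set."""
    return any(fqdn_matches(host, pattern) for pattern in patterns)

patterns = {"*.qa.example.com", "*.internal.example.com"}
print(fqdn_matches_any("signin.qa.example.com", patterns))    # True
print(fqdn_matches_any("signin.prod.example.com", patterns))  # False
```

One consequence of the suffix check, under the assumption above, is that a host such as notqa.example.com also satisfies *.qa.example.com, because the leading dot is trimmed along with the asterisk. A stricter policy could keep the dot in the suffix.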

Wrapping Up

Let’s summarize this deny policy in plain language:

The policy is run when an incoming API request into Kubernetes creates or updates an Ingress object.
The host value for the incoming Ingress object is compared against valid ingress hosts maintained as an annotation on the namespace being acted on.
If the host value for the incoming request does not match any of the valid ingress domains for its namespace, the request is denied and the message “invalid ingress host”, followed by the quoted host name, is passed back to the user attempting to create the Ingress object.

This programmatic logic can quickly test any Kubernetes API request, evaluating it against any data available to the OPA pod through the Kubernetes API or even external data sources. In addition to the functions used above, Rego has built-in functions, such as http.send, that make HTTP or HTTPS requests and let a policy evaluate the response data.
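Returning to the denial path summarized above, the step that turns denial messages into what the user sees can be sketched in Python. This is an illustrative model rather than OPA's own code: `build_response` is a hypothetical helper, the uid value is made up, and the fields follow the general shape of a Kubernetes AdmissionReview response (uid, allowed, status.message). The apiVersion shown is an assumption; clusters contemporary with this post may use an earlier version.

```python
def build_response(uid, deny_messages):
    """Turn a set of denial messages into an AdmissionReview-style response."""
    response = {"uid": uid, "allowed": len(deny_messages) == 0}
    if deny_messages:
        # Join every reason so the user sees all violations at once.
        response["status"] = {"message": ", ".join(sorted(deny_messages))}
    return {
        "apiVersion": "admission.k8s.io/v1",  # assumed version for this sketch
        "kind": "AdmissionReview",
        "response": response,
    }

denied = build_response("example-uid", {'invalid ingress host "signin.prod.example.com"'})
allowed = build_response("example-uid", set())
print(denied["response"]["status"]["message"])
print(allowed["response"]["allowed"])
```

The API server relays the message in the status field to the client, which is how the reason for the rejection reaches the user who submitted the Ingress.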

OPA policies and Rego are quickly gaining traction in the Kubernetes community because of this robust functionality and the ability to programmatically define policies governing the creation of any API object.

About the Author

Jamie Duncan is a recovering history major who has been working with and on Kubernetes for approximately 5 years. His primary focus is the operational aspects of Kubernetes, culminating in the May 2018 publication of OpenShift In Action by Manning Publications. Jamie focuses on the fundamental aspects and value of Kubernetes with customers, advocates, and technology fans on multiple continents. That fundamental knowledge of how containers work helps people treat containers like the revolutionary technology they are, using them strategically to solve their challenges. When not knee-deep in Kubernetes, Jamie’s a hobby farmer and F1 racing fan.