KubeAcademy by VMware
Building Images with Buildpacks: The Cloud Native Buildpacks Project

In this lesson we learn about the Cloud Native Buildpacks project, which provides a simple though sophisticated approach for building images in a way that can be easily scaled and operationalized. We cover the motivations for the project, its API-based approach, and some of the implementations available in the ecosystem.

Cora Iberkleid

Part Developer Advocate, part Advisory Solutions Engineer at VMware

Cora Iberkleid is part Developer Advocate, part Advisory Solutions Engineer at VMware, helping developers and enterprises navigate and adopt modern practices and technologies including Spring, Cloud Foundry, Kubernetes, VMware Tanzu, and modern CI/CD.

In this lesson, we'll introduce the Cloud Native Buildpacks project. We'll talk about the evolution of buildpacks and the motivation behind the project. We'll explain at a high level how it works and some of the choices in the ecosystem for using it. Buildpacks were first introduced by Heroku in 2011. They're simply components that prepare an application for launch in the cloud. Cloud Foundry embraced the concept of buildpacks early on, and over the following years both Heroku and Cloud Foundry focused on providing buildpacks to support popular programming languages. The community also saw broad adoption; as an example, there are nearly 6,500 buildpacks listed on the Heroku marketplace alone. The developer experience with buildpacks is to simply push source code or an application artifact, and the cloud platform takes care of the rest.

The platform detects which buildpack to use and provides a staging environment. The buildpack produces a staged application in the form of a tarball containing the compiled source and any dependencies. The platform launches the staged application, and you've got your application running in the cloud. This model is very easy to use, and it provides enterprises with the consistency, governance, security, and manageability that they need to operate at scale. But this design also has some limitations, because it's tightly coupled and it lacks flexibility. By and large, if you're not using Heroku or Cloud Foundry, you can't use buildpacks or the staged application that they create. Buildpacks are also difficult to modularize because they play both the role of orchestrator of the overall build process and worker providing the language-specific dependencies. Now, this design predates Docker, the Open Container Initiative, and Kubernetes. It made adjustments over the years, but in order to really bring its strengths to the modern container ecosystem and enable more advanced features, it needed to be redesigned.

So Heroku and Pivotal (now VMware, and the original incubator of Cloud Foundry) jointly established the Cloud Native Buildpacks project in 2018 for this purpose. To alleviate the problems of coupling and flexibility, the project provides a well-defined contract between platforms and buildpacks. It includes a platform API so that new platforms can be created, each with the freedom to offer a unique user experience. It also separates the orchestration of the build process from the buildpacks themselves and provides a buildpack API that enables the creation and composability of modularized buildpacks. In addition, the end result of a build is an OCI image, which means it can be stored in any image registry and run on any container runtime. This makes it possible to implement new capabilities and optimizations. We'll talk more about this shortly.

So a platform is a tool that you use as an end user, or as part of your build pipeline, in order to take advantage of buildpacks. Buildpacks are components that provide runtime support for applications. They enable us to add content or behavior to an image in a way that we can easily operationalize and scale. The interaction between the two, expressed through the two APIs, constitutes the generic set of operations that any platform needs to carry out in order to leverage an arbitrary set of buildpacks. Because this is common behavior that all platforms can share, the Cloud Native Buildpacks project makes it easy by providing a reference implementation. This implementation is called the lifecycle, and it comprises a set of well-defined phases. Platforms only need to orchestrate the lifecycle rather than implement their own.

Let's walk through the phases for the first build of an application. The detect phase determines which buildpacks to run and in what order. For example, for a Java application, it may identify a JVM buildpack, a [inaudible 00:04:12] buildpack, and an Apache Tomcat buildpack. The build phase runs the buildpacks; each buildpack prepares layers of assets to add to the final image. Each layer includes metadata about its contents, such as the JVM version, for example. Now, the lifecycle is executing on a build image, so much like in the multi-stage Dockerfile example that we saw earlier in this course, the export phase copies the layers from the build image to a slimmer run image. If the platform provides storage, then the export can also cache layers to optimize subsequent builds.
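To give a sense of the buildpack side of the detect phase, here is a minimal sketch of a detect script, assuming the Buildpack API's two-argument contract (a platform directory and a build plan path, with exit code 0 to opt in and 100 to opt out). The buildpack and the build plan entries shown here are hypothetical illustrations, not an official implementation:

#!/usr/bin/env bash
# bin/detect for a hypothetical JVM buildpack (illustration only).
# The lifecycle invokes this as: bin/detect <platform_dir> <build_plan_path>
set -eo pipefail

plan_path="$2"

# Opt in only if the application looks like a Maven project.
if [[ -f pom.xml ]]; then
  # Record what this buildpack provides and requires in the build plan.
  printf '[[provides]]\nname = "jdk"\n\n[[requires]]\nname = "jdk"\n' >> "$plan_path"
  exit 0    # detection passed
fi

exit 100    # detection failed; the lifecycle skips this buildpack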

One interesting feature is that for each layer, a buildpack can specify whether or not to export and cache it. For example, a JVM buildpack may create JDK and JRE layers, cache both of them, but only export the JRE to the run image. On a rebuild, the detect phase once again determines the set of ordered buildpacks to run. The analyze phase retrieves metadata about image layers and cache layers, so that buildpacks can decide which layers need to be recreated. The fact that each buildpack can make decisions independently about reusing layers means that the cache is not as sensitive to ordering as it is with a Dockerfile. If cache is available, the restore phase makes it accessible by loading it into the build image file system. The build phase once again executes all of the buildpacks, but the rebuild is optimized because it has the local cache, as well as the necessary information to minimize the number of layers to recreate.

The export phase pushes only new layers to the registry, along with a new manifest that can point to unchanged layers from the previous image as well as the new layers. It also updates the cache as necessary. The lifecycle also provides a rebase phase that can be invoked independently to swap out the runtime base image. This is intended for cases where an operating system vulnerability is discovered and a patch is made available in the form of a new run image. The rebase is invoked, and a new application image is created that points to the digests of the run image layers that changed. There's no need to recreate or copy the other layers. This operation can complete within milliseconds on a registry, and the new base image is guaranteed to be compatible by compliance with what's called an application binary interface. So using the right platform, this rebase can be easily applied to hundreds or thousands of images in a matter of seconds, which makes it a powerful security tool.
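For a sense of what that looks like in practice with the pack CLI, here is a sketch; the image names are placeholders, and the --run-image flag is only needed when you want to point at a specific patched run image:

# Rebase an existing application image onto the latest run image
pack rebase registry.example.com/my-app:latest

# Or rebase onto an explicitly chosen run image
pack rebase registry.example.com/my-app:latest --run-image paketobuildpacks/run:base-cnb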

Let's recap by way of an example, using a CLI called pack as our platform to build an image. We use the command pack build and provide a name for our image. We need to provide the path to our source code; a build image, which is comparable to the FROM statement in the first stage of the multi-stage build that we saw in our Dockerfile examples, and which should have the lifecycle installed; then a run image, which is comparable to the FROM statement in the second stage from our Dockerfile examples; and then the list of buildpacks to apply. You can see that the build and run images, as well as the buildpacks, are all provided by the buildpacks implementation, in this case an implementation called Paketo. Since we know the lifecycle can auto-detect which buildpacks to use, it makes sense to bundle the buildpacks into the build image as well, which is exactly how buildpacks are distributed.
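Put together, a fully explicit invocation might look something like this sketch; the image name, source path, builder, run image, and buildpack ID are illustrative Paketo-based examples, not the only valid choices:

pack build registry.example.com/my-app:1.0.0 \
  --path ./my-app \
  --builder paketobuildpacks/builder:base \
  --run-image paketobuildpacks/run:base-cnb \
  --buildpack paketo-buildpacks/java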

The build image serves as a builder, with all of the components and intelligence needed for the build, including a reference to the proper run image to use. Now, that's still a lot to type, so you can set your builder of choice as the default and cd into your source code directory, which brings us to the very simple command: pack build and the image name. And this will work for applications written in Java, Node, Ruby, .NET, and more. Very simple and extremely powerful. So we understand the building blocks that we need to build images with Cloud Native Buildpacks. In addition to the pack CLI, there's a growing ecosystem of platforms. Pack is, in fact, the reference implementation from the Cloud Native Buildpacks project, and it optimizes for the developer experience based on imperative commands. Another example is kpack, which is a service that can be hosted in a Kubernetes cluster and configured declaratively. Spring Boot also offers Cloud Native Buildpacks support and provides integrations with Maven and Gradle workflows, and so on.
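As a sketch of that simpler workflow, assuming the Paketo base builder and a hypothetical application directory (older versions of pack use pack set-default-builder instead of pack config default-builder):

# Set a default builder once
pack config default-builder paketobuildpacks/builder:base

# Then build from inside the source directory with just an image name
cd my-app
pack build my-app:latest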

You can also mix and match: as long as you use the same base images and buildpack versions, two different platforms will create the same image from the same source. Now, the Cloud Native Buildpacks project doesn't provide reference implementations for builders or buildpacks, because it only seeks to be opinionated about how to orchestrate a build, not about how to run an application. However, Heroku and Cloud Foundry both provide versions of their buildpacks, including build and run images, that are compatible with Cloud Native Buildpacks. Paketo is, in fact, the evolution of the Cloud Foundry buildpacks. VMware provides a commercial superset of Paketo that includes buildpacks for integrations with security scanners, application performance monitoring tools, and so on, and this ecosystem is growing as well.
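To illustrate the mix-and-match point, here is a sketch of building the same source with two different platforms, assuming both are pointed at the same Paketo builder; the image name is a placeholder, and the Spring Boot goal assumes Spring Boot 2.3 or later:

# Build with the pack CLI
pack build my-app:latest --builder paketobuildpacks/builder:base

# Build with Spring Boot's Maven integration
./mvnw spring-boot:build-image \
  -Dspring-boot.build-image.imageName=my-app:latest \
  -Dspring-boot.build-image.builder=paketobuildpacks/builder:base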

In the next lesson, we'll go hands-on with Cloud Native Buildpacks using pack, kpack, Spring Boot, and Paketo buildpacks. In this lesson, we learned about the motivation for, and the design of, the Cloud Native Buildpacks project. We reviewed how a platform uses the lifecycle to apply buildpacks to source code and produce an OCI image. We also saw some of the optimizations that Cloud Native Buildpacks brings to the table, the ability to safely and efficiently patch the operating system, and some of the choices in the ecosystem for both buildpacks and platforms.
