"Platform Engineering in the Wild with Tanzu Platform for Cloud Foundry," Jürgen Sußner (DATEV eG) and Nick Kuhn (Tanzu), August 2024.
This is what makes platform engineering new and interesting: "If I consider them [developers] as customers, I have to ask them, what should I do better? How can I do better? What can I do better to help you as my customer, as my developer to speed up your process and creating software and creating value for the company." What he’s talking about is adding product management to the way you build and operate platforms. That's Jürgen Sußner giving an overview of how his organization, DATEV, has used the Tanzu Platform for over six years to run their applications and services. Now, in 2024, their platform supports over 2,500 developers and runs over 1,200 large projects, apps, services, and the like.
With that many years, developers, and apps, the platform engineering team at DATEV has learned a lot. I've talked with platform engineers like Jürgen a lot over the years, and what struck me as new and different was how much emphasis he put on developing and sustaining the internal community. From how Jürgen describes it, the community is a vital feature of the platform.
After hearing how DATEV and other platform engineering groups talk about their internal communities, I think we can say that this community building and gardening is an integral part of platform engineering. This seems obvious, but as with many obvious-sounding things, what's easy to miss is how "mission critical" the community is.
There are at least three things the "feature" of community does:
An internal platform community scales support and troubleshooting beyond the small platform engineering team
Platform teams are very small compared to the number of developers they support. For example, one of our customers supported 1,200 developers with six platform engineers, and a study by ESG of our customer base found that a ratio of 200 developers to one platform engineer is common. And yet, platform engineers eschew relying only on tickets to support developers.
Platform engineers, of course, rely on "full automation," as Jürgen puts it, but developers will still have plenty of questions and need help troubleshooting. The only way to scale that "feature" is to activate the platform's internal community.
An internal platform community serves as a living knowledge base and documentation
As your community grows, developers will start helping each other not only troubleshoot, but also learn how to use the platform. Meanwhile, thorough documentation and notes on updates help fuel this community and keep the community smart.
Keeping that documentation and those update blogs going can seem like a low-value task to traditional ops thinking. And, it's a lot of work. As Jürgen says, "every question that gets asked at least twice or a third time should be somewhere in the documentation and it should be easy to find."
That's a lot of time spent on documentation! But, staying on top of that knowledge base is what keeps your developers' velocity going. It helps developers spend less time on troubleshooting and learning toil, and more on application development.
Plus, as the platform grows, the community will start to help with documentation as well, helping you tackle that effort. This comes up in a similar talk from the Tanzu Platform team at Mercedes. "Developers can provide their documentation, or maybe enhance our documentation. Because we are the platform team," Thomas Müller said, "we cannot integrate everything and document everything in this detail. So, therefore, this was very helpful for us."
An internal platform community discovers and adds new features to the platform
While most of the new features in a platform come from product managing those features, as Jürgen says, some features will come in through the community. Jürgen cites one example of this at DATEV. The noisy neighbor diagnostic dashboard was created by developers in their internal platform community. This was valuable to everyone, so the platform team pulled it into the platform as a standard feature.
Product management is key to platform engineering, and you'll of course see most of the new features and improvements come from the platform team. But, with a thriving internal platform community, you can start getting new platform capabilities from the developers.
Now, Jürgen doesn't discuss this at all, but I have a theory that taking features from the internal platform community also helps stave off shadow platforms: those developers who get fed up with a lack of features in the platform and go off and write their own platforms. This is a bad idea: you're introducing bespoke stacks that will soon become infrastructure debt. But you can help steer developers away from that when you work with the community, product managing the platform and even incorporating contributions from developers into the official platform.
Build a Community to Scale Your Platform
The three community features above each contribute to scaling your platform. This is key to getting value out of the effort and expense of an internal application platform and the platform engineering team that runs it. Showing the value of platforms is difficult for numerous reasons, but one of the better ways to do it is to go over how many operations people are needed to support all of the developers. As with open source communities, if you have a small, core team, you need to set up the community to take on a lot of the work. That's why community is a critical "feature" of platform engineering, especially in large enterprises like DATEV.
Platform teams need to treat the community like they would other features and services in their platform, from databases and monitoring to the basics of running cloud native applications. If one of those was floundering, there'd be big problems: the actual technology of the platform would be on the rocks, dashboards would go red, and all hands would be on deck. Your internal platform community is the same: without it, it'll be difficult to get the maximum value out of your internal developer platform.
You should check out the full talk. Jürgen goes over a lot more than just community, including platform automation, how they manage 23,000 container updates a week, and, if you wait until the end, some tips on getting your internal developer platform going despite internal skeptics.
About the Author
Michael Coté