The 10 Platform Journey Health Markers: A Roadmap to Continuous Improvement

August 6, 2019 Parker Fleming

What bothers you the most about enterprise IT? For me, it’s the tendency to promise easy solutions to complex problems. It’s far too common, especially as you attempt to navigate the complex and confusing world of “DevOps” and “cloud-native.”

It is understandable behavior, though. For much of human history, our very survival depended on conserving energy and resources, and we avoided visible and perceived risks at all costs.

Unfortunately, these instincts don’t serve us well when it comes to building and running distributed systems in our data centers or the public cloud. 

At Pivotal, we believe that transformation is something you DO, not something you BUY. This may seem counterintuitive coming from a software company, but bear with me. 

Pivotal Platform is just a tool. To maximize its value, you need to learn how to use it as it was intended. When companies wrap this modern platform in their legacy operational model, they struggle to realize the maximum potential of the product. We’ve learned there’s a better way to achieve superior outcomes.

Our PCFS team teaches you how to treat your platform as a product. We pair with you and your team to apply User-Centered Design, Extreme Programming (XP), Lean, and Site Reliability Engineering (SRE) practices. Without these practices, companies continue to be constrained by the same legacy governance, operational, security, and change control processes.

So once you’ve made the move to Pivotal Platform, it’s fair to ask a few questions. Namely, how do you know you’re getting better at running your platform as a product? How do you know if you’re executing at the highest level?

That brings us to the topic at hand.

Over the last year, we’ve developed a framework to measure the relationship between the prescribed practices and the health of your Pivotal Platform and your Platform Team. This framework, the Pivotal Platform Journey Health Markers, consists of ten distinct areas where we have seen our most successful customers achieve a state of continuous improvement.

Over the coming months, we will publish a series of blog posts that dive deeper into each of these areas. We’ll also explore how customers like you have found a way to excel in each of them.

The Ten Platform Journey Health Markers

Here is a quick summary of each of the ten Health Markers. Is there one of these that stands out to you as the most important? Is there one you and your team are struggling with? Let us know!

1. Monitoring and Metrics

Establishing desired service behavior, measuring how the service is actually behaving, and correcting discrepancies.

Why it Matters

Observability of your platform’s state is required to ensure that Service Level Objectives (SLOs) are being met. Meeting them builds your customers’ trust in the platform, which over time affords the platform team increased autonomy in its operations. Monitoring and alerting on the right things also allows the platform team to react to situations before they impact the reliability of the platform.
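
To make the idea of checking an SLO concrete, here is a minimal Python sketch. It assumes a request-based availability SLO, and the names `SloReport` and `evaluate_slo` are hypothetical, chosen for illustration; they are not part of Pivotal Platform or any monitoring product. It compares measured availability against a target and reports how much of the error budget is left.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    """Measured availability compared with an SLO target (illustrative only)."""
    availability: float      # fraction of successful requests, 0.0 to 1.0
    target: float            # SLO target, for example 0.999
    budget_remaining: float  # fraction of the error budget not yet spent

    @property
    def slo_met(self) -> bool:
        return self.availability >= self.target


def evaluate_slo(total_requests: int, failed_requests: int, target: float) -> SloReport:
    """Evaluate a request-based availability SLO over one measurement window."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")

    availability = (total_requests - failed_requests) / total_requests
    # The error budget is the number of failures the target tolerates.
    allowed_failures = (1.0 - target) * total_requests
    # Goes negative once more failures occurred than the budget allows.
    budget_remaining = 1.0 - failed_requests / allowed_failures
    return SloReport(availability, target, budget_remaining)


report = evaluate_slo(total_requests=1_000_000, failed_requests=400, target=0.999)
print(report.slo_met, round(report.budget_remaining, 2))
```

An alert on a shrinking `budget_remaining` is one way to react before the SLO itself is breached; the thresholds you pick would depend on your own objectives.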

2. Capacity Planning

Projecting future demand and ensuring that a service has enough capacity in appropriate locations to satisfy that demand.

Why it Matters

On-demand access to services and resources is the cornerstone of the value proposition of a platform. To meet this customer expectation, you need to plan capacity proactively so that you can accommodate externalities that require lead time, such as procurement, hardware, networking, limit management in the public cloud, and so on.
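
As a rough illustration of why lead time matters, the Python sketch below projects when capacity runs out and compares that with the time it takes to add more. It assumes simple linear growth, which real demand rarely follows exactly, and the function names and units are hypothetical.

```python
import math


def weeks_until_exhausted(current_usage: float, capacity: float, weekly_growth: float) -> float:
    """Weeks until usage reaches capacity, assuming linear growth."""
    if current_usage >= capacity:
        return 0.0
    if weekly_growth <= 0:
        return math.inf  # flat or shrinking demand never exhausts capacity
    return (capacity - current_usage) / weekly_growth


def must_order_now(current_usage: float, capacity: float, weekly_growth: float,
                   lead_time_weeks: float, safety_margin_weeks: float = 0.0) -> bool:
    """True when the remaining runway is no longer than lead time plus margin."""
    runway = weeks_until_exhausted(current_usage, capacity, weekly_growth)
    return runway <= lead_time_weeks + safety_margin_weeks


# 700 of 1000 units used, growing by 25 units per week: 12 weeks of runway.
print(weeks_until_exhausted(700, 1000, 25))
print(must_order_now(700, 1000, 25, lead_time_weeks=8, safety_margin_weeks=4))
```

The safety margin is a judgment call: it stands in for everything the linear model ignores, such as seasonal spikes or a large team onboarding at once.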

3. Platform Update Engine

Fast, frequent, frictionless delivery and measurement of incremental platform capabilities in production. 

Why it Matters

Keeping your platform current provides your users with a secure, supported, and feature-rich developer experience. By setting expectations for the rate and cadence of change of your platform (through the use of error budgets, vulnerability budgets, and legacy budgets), you can ensure you are able to provide all of these things to the business.

4. Emergency Response

Noticing and responding effectively to service failures in order to preserve the service’s conformance to its SLA.

Why it Matters

Your ability to react to adversity helps prevent the erosion of customer satisfaction, and having a healthy feedback loop allows you to learn from these incidents to avoid recurring issues. It also preserves your error budget to ensure you have time to focus on delivering new features and value to your customers.

5. Self Service

Instantiating and deleting service capacity in a predictable fashion, often as a consequence of capacity planning. 

Why it Matters

Shortening the time to market and empowering your business to support its applications without introducing additional friction are arguably the two most strategically valuable things your platform can do for your business.

6. Performance Optimization

Characterizing and tracking service component performance, efficiency, and resource utilization, to identify and address regressions and to drive improvements in efficiency.

Why it Matters

The performance of your platform capability is a key component of the reliability expectations of your customers. Monitoring of performance-related metrics can help you take corrective measures to prevent unnecessary use of your error budget. Additionally, these insights can help inform responsible capacity planning.
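
One simple way to spot a performance regression is to compare a latency percentile against a baseline. The Python sketch below assumes nearest-rank percentiles and a 10 percent tolerance; both choices, and the function names, are illustrative assumptions rather than a prescribed method.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1 to 100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def is_regression(baseline: Sequence[float], current: Sequence[float],
                  pct: int = 95, tolerance: float = 0.10) -> bool:
    """True when the current percentile exceeds the baseline by more than the tolerance."""
    return percentile(current, pct) > percentile(baseline, pct) * (1 + tolerance)


baseline_ms = list(range(1, 101))             # hypothetical latency samples in ms
slower_ms = [x * 1.2 for x in baseline_ms]    # every request 20% slower
print(is_regression(baseline_ms, slower_ms))
```

Tracking the same percentile over time also gives you a utilization trend that can feed back into capacity planning.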

7. Business Continuity

Providing a secure, low-impact mechanism to meet recovery-time and availability objectives.

Why it Matters

Having a robust and well-rehearsed process to restore your platform to a known good state ensures your business can trust the platform capability to run business-critical applications.

8. Platform as a Product

The platform team frequently updates the platform with new features and security updates, and introduces new capabilities in response to the needs of its users. The platform is treated as a product that includes not only Pivotal Platform but also all the services and integrations that make it a viable environment for applications to run.

Why it Matters

Focusing on the needs of your users not only creates great product/market fit (which delights your users!) but also prevents your platform team from wasting time building the wrong things or over-engineering solutions.

9. Balanced Team

The platform team consists of a product manager and at least two platform engineers with a combination of infrastructure and software engineering skills.

Why it Matters

Product management and platform engineering are complementary but distinct domains. Having individuals on your team who focus on each of these ensures you have high-quality interactions and feedback loops with your customers, which in turn gives the engineers on your team clear direction on what to build.

10. Path to Production

Developers are able to take full advantage of the platform via modern and optimized tools and processes.

Why it Matters

When coupled with the ability to self-service tenancy, a streamlined path to production (that is unencumbered by legacy tools, gates, change windows, etc.) is the key to minimizing the time to value for your business. (Build pipelines apply here, as do continuous integration and continuous delivery principles.) It also empowers application teams to troubleshoot and address issues without escalation.

Ready to Learn More?

The idea of “digital transformation” is squishy; the definition varies based on who you’re talking to. But if you’re serious about getting better at software, you can very quickly use these ten health markers, and others, to quantify and track how you’re doing. 

If you want to learn more about effectively running your platform, don’t miss SpringOne Platform, October 7-10 in Austin. (Use discount code S1P_Save200 to save $200 on registration.) The agenda is packed with seasoned practitioners who will share their best practices and the lessons they learned along their platform journey.

About the Author

Parker Fleming is director of Tanzu Practice at VMware.
