Hello World, Meet VMware CRE

The world of computing has changed. Applications are a critical part of everyday life, and downtime can have a disastrous effect on businesses. In a world of “Open 24/7” and “It’s always up,” expectations of reliability have risen dramatically. While Kubernetes ships with several reliability features (e.g., ReplicaSets and highly available control planes), organizations need a dedicated focus to achieve their reliability goals. This is where VMware Customer Reliability Engineering (CRE) comes in.

What is CRE?

With CRE, we take the principles of Site Reliability Engineering, such as service-level objectives (SLOs), error budgets, and eliminating toil, and partner with our customers to help them adopt those principles in their Kubernetes environments.
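
To make the SLO and error budget idea concrete, here is a minimal sketch of the usual arithmetic: an availability SLO over a time window implies a fixed amount of allowed downtime, and the error budget is what remains of it. The function names, the 30-day window, and the 99.9% target are illustrative assumptions, not part of any CRE tooling.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction, e.g. 0.999 for "three nines" (assumed example).
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative means overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    if budget == 0:
        # A 100% SLO leaves no budget at all: any downtime overspends it.
        return 0.0 if downtime_minutes == 0 else float("-inf")
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    budget = error_budget_minutes(0.999)
    print(f"Error budget for a 99.9% SLO over 30 days: {budget:.1f} minutes")
    print(f"Budget left after 21.6 minutes of downtime: "
          f"{budget_remaining(0.999, 21.6):.0%}")
```

Under these assumptions, a 99.9% SLO over 30 days allows roughly 43 minutes of downtime, which is why teams often treat the remaining budget as a signal for how much risk they can take on with new releases.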

Unlike traditional tech support models that focus strictly on reactive help via tickets, we take a foundation-building approach to your environment so we can solve problems proactively, ideally before they page someone on your team. CRE seeks to understand what problems you need Kubernetes to solve, what your long-term objectives for the platform are, and how we can help you achieve those objectives, including addressing your biggest Kubernetes pain points.

We apply our combined expertise in running large, complex Kubernetes environments and proactive support models to help you meet that high uptime requirement.

How does CRE help you?  

When we first meet you as a customer, we assign you a dedicated program manager who is responsible for ensuring we deliver our engagement model in accordance with your goals. Our engagement model centers on deep-diving into your Kubernetes architecture, identifying risks to your reliability, and working side by side with you to mitigate or eliminate those risks. Risks can include deployment anti-patterns, reliance on alpha features in production, a lack of high-fidelity monitoring and alerting at the infrastructure and Kubernetes layers, and more. This model has the potential to improve your reliability. We also provide you with a pager that lets you contact a customer reliability engineer 24/7 to help you meet your reliability objectives.
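
As one illustration of the kind of risk described above, the sketch below flags resources whose `apiVersion` is an alpha version, following the Kubernetes convention that alpha API versions look like `v1alpha1`. It is a simplified example, not a CRE tool: the function names and sample manifests are hypothetical, the manifests are assumed to be already parsed into dictionaries, and alpha features enabled through feature gates are not covered.

```python
import re

# Kubernetes alpha API versions follow the pattern v<major>alpha<n>,
# for example "v1alpha1".
ALPHA_VERSION = re.compile(r"^v\d+alpha\d+$")


def is_alpha_api(api_version: str) -> bool:
    """Return True if an apiVersion string refers to an alpha API version."""
    # The version is the part after the last "/", or the whole string
    # for core-group resources such as "v1".
    version = api_version.rsplit("/", 1)[-1]
    return bool(ALPHA_VERSION.match(version))


def find_alpha_resources(manifests: list[dict]) -> list[str]:
    """List "Kind/name" for every manifest that uses an alpha apiVersion."""
    return [
        f"{m.get('kind', '?')}/{m.get('metadata', {}).get('name', '?')}"
        for m in manifests
        if is_alpha_api(m.get("apiVersion", ""))
    ]


if __name__ == "__main__":
    # Hypothetical manifests, used only to exercise the check.
    sample = [
        {"apiVersion": "apps/v1", "kind": "Deployment",
         "metadata": {"name": "web"}},
        {"apiVersion": "example.com/v1alpha1", "kind": "Widget",
         "metadata": {"name": "experimental"}},
    ]
    print(find_alpha_resources(sample))
```

A check like this could run in a CI pipeline before manifests reach a production cluster, though a real review would also look at feature gates and cluster configuration.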

We also provide you with a customer portal filled with technical documentation that is written and tested by our engineers and is specific to your use case(s). Finally, we help you navigate the cloud-native ecosystem by providing expertise on a robust set of tools through our extended Support Matrix.

Whether you are determining which tools in the ecosystem best meet your business needs, preparing for an upcoming event that generates peak user load, or planning to move clusters into production, we have your back, and we are carrying a pager for you.

How can I get started?

Are you ready to improve your Kubernetes journey by partnering with experts who are always within reach? This brief is a great source for more information and, of course, you can always reach out to VMware directly to get started with CRE.


About the Author

Jed Salazar

Jed Salazar started his Site Reliability Engineering journey working on Borg clusters at Google. He's passionate about SRE and spreading those practices as a Customer Reliability Engineer. In his free time, he enjoys trail running in the mountains of Boulder, Colo.
