Jul 20, 2021
Starting in April 2020, my team was tasked with managing Tanzu Application Service on multiple foundations for a client. Establishing a strong SRE practice around managing the platform was an early priority. This talk discusses how we defined key metrics for monitoring availability, built custom solutions for populating availability data into an observability platform (Tanzu Observability by Wavefront), created dashboards, and established alerting practices. We discuss in depth the benefits of using a burn rate when monitoring availability error budget consumption, and how this strategy allows for more sensitive alerting and limits error budget consumption. This presentation will demonstrate how cultivating availability charts and error budget burn rate alerting creates an environment where the data starts working for your team. We emphasize the intentional use of availability error budgeting for backlog prioritization and for embracing risk when managing a platform.
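For readers new to the term, here is a minimal sketch of how an error budget burn rate is commonly computed. It is not taken from the talk: the function names, the 30-day window, and the two-window paging rule are illustrative assumptions, and the actual Wavefront queries and thresholds the team used may differ.

```python
import math


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how fast the error budget is being consumed.

    error_ratio: fraction of failed availability checks in the
                 observed window (hypothetical input, 0.0 to 1.0).
    slo_target:  availability objective, e.g. 0.999 for 99.9%.

    A burn rate of 1.0 means the budget would be used up exactly at
    the end of the SLO period; higher values exhaust it sooner.
    """
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def hours_to_exhaustion(rate: float, period_hours: float = 30 * 24) -> float:
    """Hours until a full budget is spent at a constant burn rate.

    period_hours is an assumed 30-day SLO period.
    """
    if rate <= 0:
        return math.inf
    return period_hours / rate


def should_page(long_rate: float, short_rate: float, threshold: float) -> bool:
    """Assumed two-window rule: page only when both a long and a short
    window exceed the threshold, so the alert clears quickly once the
    problem stops."""
    return long_rate > threshold and short_rate > threshold


if __name__ == "__main__":
    rate = burn_rate(error_ratio=0.0144, slo_target=0.999)
    print(f"burn rate: {rate:.1f}")
    print(f"hours to exhaustion: {hours_to_exhaustion(rate):.0f}")
```

Alerting on the burn rate instead of on the raw error ratio ties the alert to how quickly the budget will run out, which is what makes the more sensitive thresholds described above practical.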
Tiffany is a senior developer advocate at VMware and is focused on Kubernetes. She previously worked as a software developer and developer advocate (nerd whisperer) for containers at Amazon. She also formerly worked at Docker and Intel. Prior to that, she graduated from Georgia Tech with a degree in electrical engineering. In her free time she really likes to travel and dabble in photography. You can find her on Twitter @tiffanyfayj.