VMworld 2021 is upon us. Today, we are announcing several updates that improve VMware Tanzu Observability by Wavefront’s ability to deliver analytics-driven insights for site-reliability engineers (SREs), developers, and platform teams. We have added artificial intelligence and machine learning (AI/ML) root cause capabilities, enhanced our support for Prometheus Query Language (PromQL), revamped our alerting process, and created an integration between Tanzu Observability and VMware vRealize Operations Cloud. Continue reading below for an overview, and be sure to check out VMworld session APP1308, Observability for Modern Application and Kubernetes Environments, for more details on each of these announcements.
Using machine learning to focus troubleshooting efforts
Today we are announcing the beta release of Automated Probable Root Cause. Distributed applications built using microservices are complex and create lots of telemetry data. When performance issues occur, it's hard to know where to start troubleshooting. This new feature will help DevOps and SRE teams quickly troubleshoot service and application health issues without having to examine thousands of traces. It uses proprietary algorithms to analyze and surface the root cause of service and operations performance issues.
When a poorly performing service is identified, you can right-click the service in the Application Map and choose “Perform RCA”. The algorithms use the span data to calculate the likely root causes of the problem, and the results appear in the “Insights” window. Clicking one of the results takes you to the Traces Browser, where you can view the trace details. The traces that carry information about the root cause of the service's performance problems or errors are highlighted.
We are excited to work with select customers during this beta release and look forward to rolling it out to all customers in the near future.
Enhanced PromQL support limits disruption when migrating
Enterprises adopting Kubernetes typically use open source Prometheus and Grafana for metrics collection and data visualization. But running Prometheus at scale can be challenging. Tanzu Observability offers many features that address these challenges and the needs of enterprises, such as fine-grained security controls, high availability, and 18 months of data retention at the original data resolution. Beyond metrics, it can also ingest traces, spans, events, and histograms.
At VMworld 2020, we announced support for PromQL in Tanzu Observability to help unify open source monitoring with enterprise observability. Today, we are announcing that we will improve the support for PromQL with the forthcoming addition of PromQL HTTP API support.
PromQL support makes it easy for you to migrate your Prometheus data to Tanzu Observability. It allows developers, SREs, and Kubernetes platform operators to continue to use familiar Prometheus queries to power Tanzu Observability dashboards and alerts.
But moving to an entirely new solution can be disruptive. Even if the query language is familiar, asking teams to adjust to a new interface and workflow can affect productivity in the short term. PromQL HTTP API support will help limit disruptions in a few ways:
Teams can keep using existing tooling that talks to Prometheus, such as Grafana. Just point it at Tanzu Observability and it will work.
Autocomplete of PromQL functions and operators works in these tools.
There is no immediate need to learn a new tool or query language. Over time, teams can explore the added benefits of Tanzu Observability, including the powerful Wavefront Query Language (WQL) and the 220+ out-of-the-box integrations available.
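To make the first point concrete, here is a minimal sketch of what "talking to Prometheus" looks like at the HTTP level. It builds a range query against the standard Prometheus HTTP API path `/api/v1/query_range`, which is the kind of request Grafana issues. The base URL, the bearer-token authentication, and the metric name are assumptions for illustration; this article does not specify the endpoint or authentication scheme that Tanzu Observability will expose.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical base URL: the article does not give the real endpoint.
BASE_URL = "https://observability.example.com"


def build_range_query_url(base_url, promql, start, end, step):
    """Build a Prometheus-style range query URL (/api/v1/query_range)."""
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def run_range_query(url, token):
    """Send the query and return the result series.

    Bearer-token authentication is an assumption made for this sketch.
    """
    request = Request(url, headers={"Authorization": f"Bearer {token}"})
    with urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # The Prometheus HTTP API wraps results in {"status": ..., "data": ...}.
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


# Build (but do not send) a query for a one-hour window at 60-second steps.
url = build_range_query_url(
    BASE_URL, "rate(http_requests_total[5m])", 1633046400, 1633050000, "60s"
)
print(url)
```

Because only the base URL changes, a tool that already issues requests of this shape would need a new data source address rather than new queries.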
We know it is important that queries return the right results. We are currently working with PromLabs on third-party validation of the accuracy of the PromQL translation to Wavefront Query Language. Once complete, they will publish the results on their website.
Better alerts mean better results
A user-friendly alerting experience helps modern DevOps teams achieve faster incident resolution times by filtering noise and capturing true anomalies. Alerting in Tanzu Observability is enabled by a powerful query language and the ability to ingest data in real time. Given that software development continues to happen faster and faster, and the underlying infrastructure is getting more complex and dynamic, we’ve designed a better experience with several improvements.
Simpler creation process – Follow a five-step process for creating alerts that are more accurate, informational, and actionable.
Test and tune alerts for accuracy – Teams can now test alerts against 18 months of historical data to understand how they are likely to perform in the future. Use the results to fine-tune the accuracy of an alert before enabling it.
More context – Add links to relevant content, like runbooks and dashboards, to troubleshoot faster.
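The idea behind testing an alert against historical data can be illustrated with a small sketch. This is not how Tanzu Observability implements the feature. It is a simplified stand-in that replays a hypothetical condition ("value above a threshold for N consecutive points") over a made-up series and reports where the alert would have fired, so you can compare candidate thresholds before enabling one.

```python
def backtest_alert(samples, threshold, min_consecutive):
    """Return the indices at which the alert would have fired.

    The alert fires once per breach, at the point where the value has
    stayed above `threshold` for `min_consecutive` samples in a row.
    """
    fired_at = []
    run = 0  # length of the current streak above the threshold
    for index, value in enumerate(samples):
        if value > threshold:
            run += 1
            if run == min_consecutive:
                fired_at.append(index)
        else:
            run = 0
    return fired_at


# Made-up CPU utilization history (percent), one sample per minute.
history = [40, 55, 91, 93, 95, 60, 92, 50, 96, 97, 98, 99]

for threshold in (90, 95):
    print(threshold, backtest_alert(history, threshold, min_consecutive=3))
# 90 [4, 10]
# 95 [10]
```

In this toy history, the lower threshold fires twice and the higher one fires once, which is the kind of comparison that helps decide whether an alert is too noisy.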
And more feature enhancements related to alerting are on their way. Stay tuned!
Tanzu Observability and vRealize Operations Cloud integration
Tanzu Observability can now ingest vRealize Operations Cloud metrics. Thousands of VMware customers use vRealize Operations for self-driving IT operations management for private, hybrid, and multi-cloud environments. vRealize Operations Cloud is the SaaS version providing the same features and benefits. Through this integration, developers and SREs can now view vRealize Operations Cloud metrics alongside all the metrics, histograms, and traces collected by Tanzu Observability from other sources for a more holistic view of business-critical applications and infrastructure. Using the power of the Wavefront Query Language and the newly released support for PromQL, customers can correlate their data to understand the impact of the software-defined data center (SDDC) infrastructure on their applications, thus eliminating blind spots and helping reduce mean time to detection and mean time to repair.
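As a rough illustration of what correlating infrastructure and application data means, the sketch below computes a Pearson correlation between two time-aligned series. The series names and values are invented for the example, and in practice this kind of analysis would be expressed as a query in Tanzu Observability rather than in client-side code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


# Invented, time-aligned samples: host CPU (%) and service latency (ms).
host_cpu_percent = [20, 35, 50, 65, 80]
service_latency_ms = [110, 140, 170, 200, 230]

r = pearson(host_cpu_percent, service_latency_ms)
print(f"correlation: {r:.2f}")
```

A coefficient close to 1 would suggest that latency rises with host CPU load, which is a hint about where to look next rather than proof of cause.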
This is the first of many integrations planned between Tanzu Observability and the vRealize Suite. Try it out and let us know what you would like to see in the future.
Learn more about Tanzu Observability at VMworld
If you are attending VMworld, check out the sessions below to learn more about Tanzu Observability.
- APP1308: Observability for Modern Application and Kubernetes Environments
- APP2648: Implement Observability for Kubernetes Clusters and Workloads in Minutes
- VI2630: Best Practices and Reference Framework for Implementing Observability
- UX2551: Move from Traditional Monitoring to Observability and SRE – Design Studio
- VMTN2810: Lost in Containers? Enhance Observability with Actionable Visualization
- 2965: Kubernetes Cluster Operations, Monitoring and Observability
- 2957: Build a Data Analytics Platform in Minutes Using Deployment Blueprints
- APP2677: Meet the Experts: VMware Tanzu Observability by Wavefront
- VMTN3230: Observe Application internals Holistically
- VI1448: Take a Modern Approach to Achieve Application Resiliency
- APP1319: Transforming Customer Experiences with VMware’s App Modernization Platform
Disclaimer: VMware makes no guarantee that services announced in preview or beta will become available at a future date. The information in this press release is for informational purposes only and may not be incorporated into any contract. This article may contain hyperlinks to non-VMware websites that are created and maintained by third parties who are solely responsible for the content on such websites.
About the author: Scott Kelly