How to Get Scalable Distributed Tracing for Istio+Envoy

February 15, 2019 Chhavi Nijhawan

Service meshes like Istio+Envoy and App Mesh+Envoy are configurable infrastructure layers for microservices-based applications. They make communication between service instances flexible, reliable, and fast. Distributed tracing is a critical tool for debugging and understanding microservices. It enables users to track a request across multiple services, databases, and intermediaries such as proxies. This blog focuses on how to get distributed tracing for Istio – the most popular service mesh. A service mesh has many other aspects, including operation, data and control planes, policy control, and telemetry pieces such as metrics and logs. To investigate these aspects further, check out this blog.

Understanding Distributed Tracing for Istio

Each service in Istio interacts with other services via its Envoy proxy. The proxy intercepts every HTTP/1.1, HTTP/2, gRPC, or TCP interaction. It logically calls Istio Mixer, the component responsible for policy control and telemetry collection, before each request to perform precondition checks and after each request to report telemetry.

Distributed tracing for Istio

The Envoy proxy can report tracing information about communications between services in the mesh. However, to generate a correlated trace view showing the request flow across services and proxies, each service must propagate the trace context from its inbound request to its outbound requests. For every request without tracing headers, the Envoy proxy creates a root span and inserts the span context. When the proxy encounters a request with existing trace headers, it does not create a new root span; instead, it extracts the span context and inserts a child span, enabling tracing backends such as Zipkin or Wavefront to correlate spans and build the full trace view. With a simple configuration, distributed traces collected by Mixer can be forwarded to on-premises open-source solutions or to a cloud-native distributed tracing solution like Wavefront.
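The propagation step described above can be sketched in a few lines of Python. This is a minimal illustration, not code from Istio or Wavefront: the helper names `extract_trace_headers` and `build_outbound_headers` are hypothetical, and the header list assumes the B3-style headers that Istio documents for propagation. A service would call something like this when it makes an outbound call while handling an inbound request.

```python
# Headers that Istio documents for trace context propagation
# (Envoy's request ID, the Zipkin B3 headers, and the OpenTracing span context).
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def extract_trace_headers(inbound_headers):
    """Return the trace headers present on the inbound request, lower-cased.

    Hypothetical helper: HTTP header names are case-insensitive, so the
    lookup normalizes them before matching.
    """
    lowered = {name.lower(): value for name, value in inbound_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


def build_outbound_headers(inbound_headers, extra_headers=None):
    """Merge the propagated trace headers into the headers of an outbound call."""
    outbound = dict(extra_headers or {})
    outbound.update(extract_trace_headers(inbound_headers))
    return outbound


if __name__ == "__main__":
    inbound = {"X-B3-TraceId": "abc", "X-B3-SpanId": "def", "Host": "productpage"}
    print(build_outbound_headers(inbound, {"Accept": "application/json"}))
```

Only the trace headers are copied; unrelated inbound headers such as `Host` are deliberately left out, so the outbound request carries just enough context for the proxy to attach a child span instead of starting a new root span.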

Wavefront Distributed Tracing for Istio

Wavefront provides a scalable, cloud-native distributed tracing solution for Istio. With just a couple of commands, distributed traces from Istio can be directed to Wavefront. Once the traces are redirected (in Zipkin B3-header format), Wavefront can be used to view, query, and analyze traces and the corresponding request rate, error, and duration (RED) metrics derived from Istio traces. Below are the RED metrics view and the corresponding trace view for the product page of the Istio sample Bookinfo app in Wavefront.

Out-of-the-box metrics and histograms derived from Istio traces

Detailed trace view

Advantages of Wavefront Distributed Tracing for Istio

Here are some of the advantages of using Wavefront distributed tracing for Istio:

  • Get metrics and histograms in addition to traces (3D Observability) for faster troubleshooting – In addition to viewing and querying traces, Wavefront derives RED metrics and histograms from traces. Metrics give you the first indication of an issue, enabling you to create alerts and run analytics. Histograms help you understand the magnitude of the issue at various percentiles. Together, these three data types (3D observability) enable SREs and DevOps teams to troubleshoot issues quickly.
  • No servers to update, patch and scale – Common open source distributed tracing solutions are on-premises based solutions requiring you to maintain, update and scale pods/servers for different components e.g. collector and datastore. Wavefront is a cloud-native solution so there is no need to maintain or scale any servers.
  • Scale tested for millions of traces, metrics, and histograms – Wavefront is built to support millions of traces, metrics, and histograms. Customers like Lyft and Centrica Hive have reached fantastic scale with Wavefront.
  • Retain relevant information even after sampling – In high-traffic systems, capturing and retaining every trace can be expensive and of limited value. Intelligent sampling can reduce cost, but you might lose some information. With Wavefront, even if you sample, metrics are kept for all traces, so you get the full view even after sampling.
  • High availability – Ensuring high availability of on-premises based open-source solutions is a challenge. Wavefront, a cloud-native monitoring and analytics platform, provides a highly available observability solution.
  • Eliminate context switching – SREs often need to refer to multiple solutions to get visibility into metrics, histograms, and traces. For instance, service mesh users often maintain Prometheus for metrics monitoring and Jaeger or Zipkin for traces. Wavefront provides a single platform with a unified view of metrics, histograms, and traces, reducing the time spent context switching while troubleshooting critical issues.
  • Multi-cluster visibility – Service mesh users often have to deploy an instance of a metrics monitoring solution and a distributed tracing solution for each Kubernetes cluster. With Wavefront, you can get visibility into traces, histograms, and metrics from different clusters in one platform, enabling you to compare performance across Kubernetes clusters.
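To make the ideas of trace-derived RED metrics and sampling concrete, here is a small Python sketch. It is an illustration only and does not represent how Wavefront computes or stores these values: the span fields (`duration_ms`, `error`), the function names, and the nearest-rank percentile method are all assumptions chosen for the example. The point it shows is that metrics can be computed over every span while only a sampled subset of spans is retained.

```python
import math
import random


def red_metrics(spans, window_seconds):
    """Compute request rate, error rate, and duration percentiles from spans.

    Each span is assumed to be a dict with 'duration_ms' and 'error' keys
    (a hypothetical shape used only for this sketch).
    """
    durations = sorted(span["duration_ms"] for span in spans)
    errors = sum(1 for span in spans if span["error"])

    def percentile(p):
        # Nearest-rank percentile over the sorted durations.
        if not durations:
            return None
        rank = max(1, math.ceil(p / 100 * len(durations)))
        return durations[rank - 1]

    return {
        "request_rate": len(spans) / window_seconds,
        "error_rate": errors / len(spans) if spans else 0.0,
        "p50_ms": percentile(50),
        "p95_ms": percentile(95),
    }


def sample_spans(spans, rate, seed=0):
    """Keep roughly `rate` of the spans; a simple random sampler for illustration."""
    rng = random.Random(seed)
    return [span for span in spans if rng.random() < rate]


if __name__ == "__main__":
    spans = [{"duration_ms": 10 * (i + 1), "error": i < 2} for i in range(10)]
    metrics = red_metrics(spans, window_seconds=10)  # derived from all spans
    retained = sample_spans(spans, rate=0.2)         # only a subset is stored
    print(metrics, len(retained))
```

Because `red_metrics` runs before sampling, the rate, error, and percentile values still describe all requests even when most individual traces are discarded.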

Send Istio Traces to Wavefront With Just Two Commands

You need only two commands to redirect Istio traces to Wavefront. From your Kubernetes cluster, run the following steps:

  1. Delete existing Zipkin service

kubectl delete svc zipkin -n istio-system

  2. Deploy new Zipkin service

kubectl apply -f Zipkin-svc-redirect.yml

These steps and the corresponding YAML file can be found here. Once the steps are complete, Istio Mixer starts sending spans in Zipkin format (i.e., with B3 propagation headers) to the Wavefront proxy. The Wavefront proxy understands the Zipkin trace data format, enabling customers to view and analyze Istio traces.

Summary

Wavefront provides a cloud-native, scale-tested distributed tracing solution for Istio+Envoy. Compared with popular open-source distributed tracing solutions, Wavefront requires no servers to maintain, and you get 3D observability with out-of-the-box traces, metrics, and histograms. Additionally, Wavefront supports millions of traces, metrics, and histograms, so you don’t have to worry about scaling your observability solution as your application traffic grows. Enroll in the Wavefront distributed tracing beta today and enjoy a scalable distributed tracing solution for Istio+Envoy.

The post How to Get Scalable Distributed Tracing for Istio+Envoy appeared first on Wavefront by VMware.

About the Author

Chhavi Nijhawan

Chhavi is a Product Line Marketing Manager at Wavefront by VMware. Before Wavefront, she worked at New Relic, SnapLogic and Cisco, where she led product marketing and technical marketing. She has over 10 years of IT industry experience. She is also an AWS certified solutions architect.
