Understanding the Relationship Between Two Types of Plans
Greenplum Command Center 4.2 debuts a new beta feature: the visual explain plan. It visually represents the steps, or operations, that a query performs to return its results, which makes it easier to follow the progress of steps and substeps in a query. In Command Center, the visual explain plan is found under the Plan & Progress tab of a query’s detail page.
Gif of an explain plan in GPCC
A note: this post won’t go into detail about what each step means, or how to tell if there are problems with the plan. You can learn more about that here, here or here. Rather, this post focuses on showing the relationship between a visual plan and a textual plan.
Knowing the relationship between these two explain plan formats will help you utilize both plans as you investigate a query. The visual plan provides a quick way to view progress and check performance as well as read large plans. The textual plan enables you to see details of each step quickly and scan for certain keywords.
We’ll explain how these two plans relate through two concepts: the order of the plan steps and slices.
The order of explain plan steps
In a textual explain plan, the first step to complete is the bottom-most, most indented step. Steps are grouped by indentation, and within each group the most indented step is the first one to run.
For example, in this snapshot of a textual explain plan, Seq Scan on nation would be the first step to start, followed by the Broadcast Motion, then the Hash. Seq Scan on supplier would be running concurrently with the other three steps to meet with the Hash. Only when all these steps complete can Hash Join be completed as well.
One thing to note is that all the steps start at the same time; however, their completion times vary depending on when each step receives the tuples (rows) it needs to run its action.
In a visual explain plan, the bottom-to-top reading order is preserved. However, indents manifest as branches within the plan visualization. What would be the most indented step in the textual explain becomes the last step of a branch in a visual explain.
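To make the indent-to-branch relationship concrete, here is a minimal Python sketch. It is not how Command Center builds its visualization. The plan text is a hypothetical, simplified stand-in for the snapshot described above (costs, row counts, and slice annotations are omitted), and the names `Step`, `parse_plan`, and `branches` are made up for illustration. It turns indentation into a tree, then lists each branch from its most indented step up to the top.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified plan text modeled on the steps named in this post.
PLAN_TEXT = """\
Hash Join
  ->  Seq Scan on supplier
  ->  Hash
        ->  Broadcast Motion
              ->  Seq Scan on nation
"""


@dataclass
class Step:
    name: str
    indent: int
    children: list = field(default_factory=list)


def parse_plan(text):
    """Build a tree from indentation: a deeper indent means a child step."""
    root = None
    stack = []  # path from the top step to the most recently seen step
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        name = line.strip()
        if name.startswith("->"):
            name = name[2:].strip()
        elif root is not None:
            continue  # assumption: non-arrow lines after the top step are details
        step = Step(name, indent)
        # Pop back to the nearest step that is indented less than this one.
        while stack and stack[-1].indent >= indent:
            stack.pop()
        if stack:
            stack[-1].children.append(step)
        else:
            root = step
        stack.append(step)
    return root


def branches(step, path=()):
    """Yield every branch, ordered from its most indented step up to the top."""
    path = path + (step.name,)
    if not step.children:
        yield list(reversed(path))
    for child in step.children:
        yield from branches(child, path)


tree = parse_plan(PLAN_TEXT)
for branch in branches(tree):
    print(" -> ".join(branch))
```

Each printed line is one branch read bottom to top. Both branches end at Hash Join, which is why that step can only complete after everything beneath it has finished.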
Slices
Slices are portions of the query plan that are processed in parallel by the segments.
Greenplum is a massively parallel processing (MPP), shared-nothing data management system. In other words, a table is split into smaller disjoint subsets stored on different segments of the cluster. This allows query processing to be done in parallel across the segments, so results return quickly.
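As a rough illustration of splitting a table into disjoint subsets, the Python sketch below assigns each row to a segment by hashing a key column. Everything in it is an assumption made for the example: the `crc32` hash stands in for whatever hash function Greenplum actually uses, and the column names, row values, and segment count are invented.

```python
import zlib


def segment_for(key, num_segments):
    """Pick a segment for a key. crc32 is a stand-in, not Greenplum's hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def distribute(rows, key_column, num_segments):
    """Split rows into disjoint per-segment subsets based on the key column."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[segment_for(row[key_column], num_segments)].append(row)
    return segments


# Hypothetical rows and a hypothetical four-segment cluster.
rows = [{"n_nationkey": k, "n_name": f"nation_{k}"} for k in range(10)]
segments = distribute(rows, "n_nationkey", 4)

for number, subset in enumerate(segments):
    print(f"segment {number}: {[row['n_nationkey'] for row in subset]}")
```

Every row lands on exactly one segment, so each segment can work on its own subset without coordinating with the others until rows need to move.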
For tuples to move from one segment to another, or for results from all segments to be gathered in one place, GPDB uses motion nodes. Each motion node marks a boundary between slices, so you can usually tell that a new slice begins when you see one of the three types of motion nodes (gather, redistribute, broadcast). Learn more about that here.
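Spotting slice boundaries in a textual plan can be sketched in a few lines of Python. The plan text below is hypothetical and simplified (no costs or row counts), and it assumes motion lines carry an annotation of the form `(sliceN; segments: M)`. The `slice_boundaries` helper is a made-up name for this example, not part of any Greenplum tool.

```python
import re

# Hypothetical, simplified plan text; real plans also include costs and rows.
PLAN_TEXT = """\
Gather Motion 4:1  (slice2; segments: 4)
  ->  Hash Join
        ->  Seq Scan on supplier
        ->  Hash
              ->  Broadcast Motion 4:4  (slice1; segments: 4)
                    ->  Seq Scan on nation
"""

# Matches the three motion types and captures the slice number that follows.
MOTION = re.compile(r"(Gather|Redistribute|Broadcast) Motion\b.*?\(slice(\d+)")


def slice_boundaries(text):
    """Return (slice number, motion type) for each motion node, top to bottom."""
    found = []
    for line in text.splitlines():
        match = MOTION.search(line)
        if match:
            found.append((int(match.group(2)), match.group(1)))
    return found


for slice_number, motion_type in slice_boundaries(PLAN_TEXT):
    print(f"slice {slice_number} is bounded by a {motion_type} Motion")
```

Scanning for these keywords is the kind of quick lookup the textual plan is good at; seeing how the steps between two motion nodes group together is easier in the visual plan.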
Below is the full explain plan of the excerpt above, color-coded with slices.
As you can see, slices aren’t necessarily linear with the organization of the explain text and can be split through the plan (like slice 4). However, in a visual plan, the structure is clearer:
What’s made clearer in the visual explain plan is how some steps depend on other steps to begin first. In a textual format, the linear nature makes it difficult to describe the flow from step to step and how slices are grouped. In a visual explain, it’s much easier to see slices and flow.
But what does this have to do with executing a query and explain plans? When a query hits Greenplum, a few things happen. The optimizer translates it into the best sequence of steps for the query to take, and then groups the steps into the most efficient slices. The optimizer also estimates the number of tuples to be processed in each step. Only then does the query start. With the visual or textual explain plan, you can determine what part of the query is causing inefficiencies and edit the query. To learn more about optimizing queries, check out Optimizing Greenplum Performance and Tuning Greenplum Database.
The visual plan and textual plan are both ways to view the execution process for the query. With the visual plan, you’re able to quickly view the progress of the query and see how long each step is active. With the textual explain, you’re able to see the cost and the rows being processed. By using each for what it was created for, you can better understand and analyze your query.
As we continue to improve the visual explain plan in Command Center, you’ll see more features like slice visualization to help understand query plans. Stay tuned!
Thanks to Shreedhar Hardikar, Venkatesh Raghavan, and Hao Wang for teaching me these things and Jing Li, Eric Cipra, and Mike McDearmon for feedback!