When it comes to visualizing hierarchically organized data, you often see indented charts and icicle plots.
This interactive article presents these two types of visualization and discusses their characteristics. To work out what makes such a visualization effective in practice, we will reflect on options for label placement, interactivity, and scaling the approaches to larger datasets.
Known from file browsers and other tools, indented charts are so simple that they might not even be considered a visualization, but just a user interface component.
Elements of the hierarchy are vertically listed below each other, with increasing horizontal indentation per level. Child elements directly follow their parents.
Let us visualize a simple hierarchy that shows a directory of files as an indented chart.
The labels of the files and their indentation alone are sufficient to convey the hierarchical structure.
Additional lines can mark the levels of the hierarchy and help us trace which child nodes belong to a parent.
We can draw circles to mark each node more clearly.
These can be used, for instance, to encode the type of the node as color.
And, very useful for files, the area of the circles may encode the sizes of the respective files and directories.
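To encode a size as the area of a circle rather than as its radius, the radius has to grow with the square root of the value. The following is a minimal sketch; the helper name `radius_for_size`, the maximum radius, and the file sizes are assumptions made for illustration only.

```python
import math


def radius_for_size(size, max_size, max_radius=20.0):
    """Return a radius so that the circle AREA is proportional to size.

    max_radius is the (assumed) radius used for the largest value.
    """
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    # Area grows with the square of the radius, hence the square root.
    return max_radius * math.sqrt(max(size, 0.0) / max_size)


# Hypothetical file sizes in kilobytes.
sizes = {"report.pdf": 400, "notes.txt": 100}
radii = {name: radius_for_size(s, max_size=400) for name, s in sizes.items()}
```

With these values, a file that is four times as large gets a circle with twice the radius, that is, four times the area.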
If you look for more information on this representation, you will quickly find that it has no single, consistently used name. I chose indented chart here because it appears to be among the most frequently used terms and is quite descriptive. Other common names are indented plot, indented layout, indented tree, and tree view.
The visualization is often used as an interactive component that lets users expand and collapse inner nodes of the hierarchy. Because nodes start out collapsed, as in file browsers, even huge hierarchies can be explored interactively. Such interfaces only rarely show additional numeric attributes, but examples exist across various domains. For instance, TreeSize is a tool that, similar to our example above, visualizes file sizes within the indented hierarchy.
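Expanding and collapsing can be sketched as a traversal that simply does not descend into collapsed nodes. The tree format (nested dictionaries with `name` and `children`), the function name `visible_rows`, and the use of node names as identifiers are assumptions for this example; a real component would use unique node ids.

```python
def visible_rows(node, collapsed, depth=0):
    """Return (depth, name) for every node that is currently visible.

    collapsed is a set of node names whose children are hidden.
    """
    rows = [(depth, node["name"])]
    if node["name"] not in collapsed:
        for child in node.get("children", []):
            rows.extend(visible_rows(child, collapsed, depth + 1))
    return rows


# Hypothetical directory structure.
tree = {"name": "project", "children": [
    {"name": "src", "children": [{"name": "main.py"}, {"name": "util.py"}]},
    {"name": "docs", "children": [{"name": "index.md"}]},
]}

rows_src_collapsed = visible_rows(tree, collapsed={"src"})
rows_all_collapsed = visible_rows(tree, collapsed={"project"})
```

Collapsing a node removes all its descendants from the list, so the number of rows, and with it the required vertical space, shrinks accordingly.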
Furthermore, indentation of program code can be considered a variant of this. Many code editors visually indicate the levels by vertical lines and allow for collapsing parts of the hierarchical program structure.
Generally, an indented chart can be considered a variant of a node-link diagram, in a simple layout where all nodes are placed below each other. The layout algorithm is trivial to implement. In contrast to other node-link layouts, it consumes more vertical but little horizontal space. This might be perceived as a disadvantage, but it also means we can simply add vertical scrolling when the diagram outgrows the available screen space.
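To illustrate how little the layout requires, here is a sketch of one possible implementation: a pre-order traversal that assigns each node the next free row and an indentation equal to its depth. The dictionary-based tree format and the names `indented_layout` and `render` are assumptions for this example.

```python
def indented_layout(node, depth=0, rows=None):
    """Return a list of (row, depth, name) in pre-order.

    Each node takes the next free row; children directly follow their parent.
    """
    if rows is None:
        rows = []
    rows.append((len(rows), depth, node["name"]))
    for child in node.get("children", []):
        indented_layout(child, depth + 1, rows)
    return rows


def render(rows, indent="  "):
    """Render the layout as plain text, one node per line."""
    return "\n".join(indent * depth + name for _, depth, name in rows)


# Hypothetical directory structure.
tree = {"name": "project", "children": [
    {"name": "src", "children": [{"name": "main.py"}, {"name": "util.py"}]},
    {"name": "README.md"},
]}

rows = indented_layout(tree)
text = render(rows)
```

In a graphical version, the row index would be multiplied by a line height and the depth by an indentation width to obtain pixel coordinates.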
You may have noticed that the list-like layout makes it straightforward to print labels for each node, and there is sufficient space for longer labels. In contrast, most other node-link layouts, though slightly more compact, run into problems when placing labels. Layouts that use a radial arrangement are especially difficult to label, partly because rotated text is harder to read.
There are not many user studies comparing indented charts with other node-link layouts. However, the few studies that investigate variants of the two suggest similar performance [1][2]. So, if readability can be assumed to be comparable, the choice between the two approaches comes down to practical issues such as label placement.
Indented charts are simple to implement, known by every user, allow for interactive exploration of large hierarchies, and do not cause issues when printing labels.
In contrast to indented charts, an icicle plot focuses more on providing a good overview of the sizes of the hierarchy nodes. In this regard, it is similar to a treemap, where the size of a node is encoded in the rectangular area it fills. But instead of nesting the nodes, icicle plots work in layers. They start with a full-width rectangle and attach the child nodes below their parent as rectangles, scaled according to their sizes.
We can use the small file structure again as an example and see how it looks as an icicle plot. Yet, let us first assume that all files in our example have the same size.
In the resulting base layout, the root node is placed at the top layer at full width. The layer below is subdivided into the respective directories and files in a way that the directories consume a width proportional to the number of contained files, and so on.
We have already encoded the levels of the hierarchy with slightly decreasing opacity, but, again, we can also use colors to distinguish the different types of files.
And finally, the nodes can be scaled according to the respective file sizes. As a consequence, the widths of directory nodes also adapt and match the summed sizes of the contained files.
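The layout described above can be sketched as a recursive subdivision: each node receives a horizontal interval, and its children split that interval in proportion to their aggregated sizes. This is an illustrative sketch, not the implementation behind the figures; the tree format, the function names, and the file sizes are assumptions. Leaves without a `size` entry count as 1, which corresponds to the equal-size base layout.

```python
def total_size(node):
    """Aggregated size: a leaf's own size, or the sum over all children."""
    children = node.get("children", [])
    if not children:
        return node.get("size", 1.0)  # equal sizes if none is given
    return sum(total_size(child) for child in children)


def icicle_layout(root, width=100.0):
    """Return {name: (x, depth, w)} with the root spanning the full width."""
    rects = {}

    def place(node, x, depth, w):
        rects[node["name"]] = (x, depth, w)
        total = total_size(node)
        offset = x
        for child in node.get("children", []):
            child_w = w * total_size(child) / total if total else 0.0
            place(child, offset, depth + 1, child_w)
            offset += child_w  # the next sibling starts where this one ends

    place(root, 0.0, 0, width)
    return rects


# Hypothetical directory with file sizes.
tree = {"name": "project", "children": [
    {"name": "src", "children": [
        {"name": "main.py", "size": 30}, {"name": "util.py", "size": 10}]},
    {"name": "README.md", "size": 60},
]}

rects = icicle_layout(tree)
```

The sketch recomputes the aggregated sizes on every level for brevity; for large hierarchies, one would compute them once in a bottom-up pass and store them on the nodes.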
From a distance, these vertically appended rectangles somewhat resemble hanging icicles; the term goes back to a paper from the early 1980s [3]. If arranged from bottom to top instead (and colored in red-orange hues), they look more like flames and are sometimes called flame graphs [4]. Of course, we could also choose a layout from left to right or right to left; these orientations do not have specific names.
You might have noticed that placing labels is sometimes tricky in an icicle plot. For bigger nodes, there is usually enough space to print the full label. For smaller ones, the label needs to be cut or cannot be printed at all. However, depending on the application, this might not be an issue if size corresponds to importance. For instance, when investigating which files consume the most space on our hard disk, we would not be interested in the small files and their names anyway.
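One simple way to handle this is to estimate how many characters fit into a rectangle and then print, shorten, or drop the label accordingly. The sketch below assumes a fixed average character width in pixels, which is a crude stand-in for real text measurement; the function name `fit_label` and all parameter values are hypothetical.

```python
ELLIPSIS = "\u2026"  # single-character ellipsis


def fit_label(label, box_width, char_width=7.0, min_chars=3):
    """Return the label, a shortened label, or "" if the box is too narrow.

    char_width is an assumed average glyph width in pixels.
    """
    max_chars = int(box_width // char_width)
    if len(label) <= max_chars:
        return label  # the full label fits
    if max_chars < min_chars:
        return ""  # too narrow to show anything meaningful
    return label[: max_chars - 1] + ELLIPSIS


full = fit_label("README.md", box_width=100)
cut = fit_label("README.md", box_width=42)
hidden = fit_label("README.md", box_width=14)
```

In a browser-based implementation, the estimate would typically be replaced by measuring the rendered text.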
Still, finding the best possible labeling result is desirable. Rotating individual labels by 90 degrees might, in some cases, allow printing longer labels at the cost of poorer readability of the vertical labels. Alternatively, rotating the whole plot so that nodes are arranged from left to right is an option to ease labeling. However, which layout works best can only be decided depending on the available screen space, the lengths of the labels, and the size and structure of the hierarchy.
Size + size = size
For file sizes, the division of rectangles in the plot works nicely and is intuitive to understand, as the sum of the sizes of the child nodes is the size of the parent. If this property does not hold, however, the icicle encoding might not be advisable, for instance, if a parent node can have a size that is independent of its child nodes. Likewise, if the sizes are on a log scale, the sum of the sizes of the children is no longer the size of the parent. Hence, we have to be careful not to blindly use any numeric attribute as the value encoded in an icicle plot.
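A small numeric check makes the log-scale problem concrete. The values and the helper `is_additive` are made up for illustration.

```python
import math


def is_additive(parent_value, child_values, rel_tol=1e-9):
    """Check whether the parent's value equals the sum of its children's."""
    return math.isclose(parent_value, sum(child_values), rel_tol=rel_tol)


# Hypothetical directory of 1000 units containing two files.
child_sizes = [100.0, 900.0]
parent_size = 1000.0

linear_ok = is_additive(parent_size, child_sizes)

# On a log scale, the children sum to roughly 4.95, but the parent is 3.
log_children = [math.log10(s) for s in child_sizes]
log_ok = is_additive(math.log10(parent_size), log_children)
```

Here, `linear_ok` is true and `log_ok` is false: with log-scaled values, the child rectangles would no longer add up to the width of their parent.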
Zooming in
While simple expand and collapse operations can make an indented chart scale to large hierarchies, this strategy is of limited use for an icicle plot. Collapsing a node would not reduce the size of its parent and hence would not free any screen space. Instead, we can use a zooming approach. Unlike traditional pan-and-zoom in two dimensions, as used for maps and images, the zoom can be restricted to one dimension because, in most cases, only the breadth and not the depth of the hierarchy limits the readability of an icicle plot.
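One-dimensional zooming can be sketched as a transform that stretches only the breadth coordinate and shifts it by a scroll offset, while the depth coordinate stays untouched. This is an assumed formulation with hypothetical names: `zoom` is a factor of at least 1, `scroll` is a value between 0 and 1, and the unzoomed layout is taken to span exactly the viewport.

```python
def zoom_transform(x, w, zoom, scroll, viewport):
    """Map a node's breadth interval (x, w) to screen coordinates.

    zoom >= 1 stretches the breadth axis; scroll in [0, 1] selects which
    part of the stretched content is shown in the viewport.
    """
    content = viewport * zoom              # total extent after zooming
    offset = scroll * (content - viewport)
    return x * zoom - offset, w * zoom


def is_visible(screen_x, screen_w, viewport):
    """A node needs drawing only if it overlaps the viewport."""
    return screen_x + screen_w > 0 and screen_x < viewport


# A narrow node in the middle of an 800-pixel-wide layout.
sx, sw = zoom_transform(x=400, w=10, zoom=4, scroll=0.5, viewport=800)
start_visible = is_visible(*zoom_transform(0, 10, 4, 0.5, 800), viewport=800)
```

With a fourfold zoom and the scroll position in the middle, the node keeps its position at 400 but grows from 10 to 40 pixels, while nodes at the very start of the layout move out of view.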
Below is an example of a larger hierarchy (the file example above copied 50 times with varying file sizes) with vertical zooming and scrolling, implemented through two sliders.
[Interactive figure: icicle plot of the larger hierarchy with a Zoom slider and a Scroll slider]
The example shows that, in the zoomed-out version, the structure of the hierarchy is still readable. You can see which are the bigger branches and nodes of the hierarchy and how the differently colored file types are spread across the hierarchy levels. Zooming in reveals the individual directory and file names and allows exploring the smaller nodes.
Please note the extra shading of the background of the nodes—a white gradient on the left side of each rectangle, creating a kind of cushion effect [5]. This helps discern the rectangles vertically on higher zoom levels. Borderlines around the rectangles, in contrast, would have cluttered the diagram.
Timelines of nested calls
If you want to see the approach in action, I recommend using performance profiling tools (e.g., those available in the developer tools of your favorite web browser). Most of these tools show timelines of nested calls as icicle plots. Here, each node represents the call of a method, and the nodes are stacked hierarchically according to the call tree. The sizes of the nodes correspond to their execution times. So, when the nodes are arranged on the timeline at the correct hierarchy level, a temporal icicle plot is created.
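As a sketch of how such a timeline could be derived, the following code assigns a hierarchy level to each call from its start and end time, assuming that the calls are properly nested. The record format `(name, start, end)` and the example calls are hypothetical and not taken from any particular profiler.

```python
def call_depths(calls):
    """Return (name, start, duration, depth) for properly nested calls.

    calls is a list of (name, start, end); a call is a child of the
    innermost call that is still running when it starts.
    """
    open_ends = []  # end times of the calls currently on the stack
    result = []
    # Sort by start time; for equal starts, the longer (outer) call first.
    for name, start, end in sorted(calls, key=lambda c: (c[1], -c[2])):
        while open_ends and open_ends[-1] <= start:
            open_ends.pop()  # that call has already returned
        result.append((name, start, end - start, len(open_ends)))
        open_ends.append(end)
    return result


# Hypothetical call records with times in milliseconds.
calls = [("main", 0, 100), ("load", 0, 40), ("parse", 10, 30), ("render", 40, 90)]
timeline = call_depths(calls)
```

Each resulting tuple corresponds to one rectangle: start and duration give its position and size along the time axis, and the depth gives its layer.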
The radial variant of an icicle plot, often called a sunburst chart, might be presented and used even somewhat more commonly. It has slightly different properties. Here, the root is a circle in the center and all child nodes are ring segments at the respective radial layer.
The diagram below shows the file system example from above drawn as a sunburst chart with radially placed labels.
The nodes are subdivided not by width but by angle. This results in nodes being disproportionately smaller towards the center. However, interpreting each ring at a layer as 100% of a quantity, the relative sizes are interpretable, somewhat like in a pie chart where the full pie usually also represents 100%. Hence, node sizes are comparable within a layer, but not across layers—and angles might be harder to compare.
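The distortion can be quantified with the area of a ring segment. The sketch below assumes that all rings have the same radial width and that the root circle counts as layer 0; the function name is hypothetical.

```python
import math


def ring_segment_area(fraction, depth, ring_width=1.0):
    """Area of a segment covering `fraction` of the ring at layer `depth`.

    Assumes equally wide rings, with the root circle as layer 0.
    """
    r_inner = depth * ring_width
    r_outer = (depth + 1) * ring_width
    angle = 2 * math.pi * fraction
    return 0.5 * angle * (r_outer ** 2 - r_inner ** 2)


# The same share (50%) of a layer, once on layer 1 and once on layer 2.
inner = ring_segment_area(0.5, depth=1)
outer = ring_segment_area(0.5, depth=2)
ratio = outer / inner
```

Under these assumptions, a node covering half of layer 2 has 5/3 of the area of a node covering half of layer 1, although both represent the same fraction of their layer.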
Labeling, as usual in radial visualizations, becomes more challenging, and one-dimensional zooming cannot be implemented as described above; however, more space is available for the outer nodes at deeper levels of the hierarchy.
Overall, I would still recommend the standard icicle plot as a default. But, aesthetics aside, a sunburst chart might be preferable in some cases to show that node sizes are fractions of a whole.
Treemaps and icicle plots both focus on showing size attributes of the nodes. Also, treemaps rely on the assumption that sizes add up linearly for inner nodes as discussed above for icicle plots. Both representations are sometimes characterized as space-filling [6]—they can fill any rectangular space. However, I find this characterization misleading for icicle plots. While this is perfectly true for treemaps, an icicle plot would usually still have some whitespace. Also, it is not desirable to use a rectangle of any aspect ratio for an icicle plot, but rather to select a specific viewport that matches the data at hand and the orientation of the plot. In this regard, a treemap is more flexible.
However, in my eyes, treemaps are often overrated. They come with clear disadvantages: First, they partly obfuscate the hierarchical structure—within the recursively nested boxes, it is challenging to perceive the inner nodes, which are the structure-forming elements of the hierarchy. One can try to counterbalance, for instance, by using cushion shading to better highlight the structure [7], but the general problem remains. Second, interactive exploration is much harder to implement and not as easy to use in treemaps. Like in icicle plots, simple collapsing and expanding of nodes would conflict with the correct sizing of the rectangles. Using two-dimensional pan-and-zoom is possible, but would ignore the hierarchical structure and obfuscate it even more. Empirical studies also provide some evidence of lower performance and user preference of treemaps compared to icicle plots [8][9].
Many variants of the two approaches exist, all of which, in contrast to the explicit links in node-link diagrams, visualize the hierarchical structure implicitly [10].
Icicle plots are great for clearly visualizing a linear size attribute of the nodes. Although they may not be known to every user, they are not difficult to understand. Combined with one-dimensional zooming, they scale well to large hierarchies.
Indented charts and icicle plots are two hierarchy visualization approaches that can be considered simpler but no less effective versions of node-link diagrams and treemaps. While the former mainly shows the hierarchical structure, the latter focuses on visualizing aggregated size attributes of the nodes. Especially when it comes to placing labels and scaling the visualizations to larger hierarchies, I see clear advantages over node-link diagrams and treemaps, as well as over radial sunburst charts. Indented charts and icicle plots are both easy to implement and available in many visualization libraries.
To follow up on more specific hierarchy visualization approaches or to implement your own solution, I recommend the following sources in addition to the references below.