Visualizations Overview

Performance profiles originate either from the user’s own means (i.e. by building their own collectors and generating the profiles w.r.t. Specification of Profile Format) or from one of the collectors in Perun’s tool suite.

Perun can interpret the profiling data in several ways:

  1. By directly running interpretation modules through the perun show command, which takes a profile conforming to Specification of Profile Format and uses one of various output backends (e.g. Bokeh, ncurses or plain terminal). The output method and format are up to the authors of the module.
  2. By using the Python interpreter together with internal modules for manipulating, converting and querying the profiles (refer to Profile API, Profile Query API, and Profile Conversions API) and external statistical libraries such as pandas.

Input profiles have to conform to Specification of Profile Format; in particular, the interpreted profiles should contain the resources region with data.

Automatically generated profiles are stored in the .perun/jobs/ directory as a file with the .perf extension. The filename is by default automatically generated according to the following template:

bin-collector-workload-timestamp.perf
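The template above can be sketched as a small helper. This is only an illustration: the function name job_filename and the exact timestamp layout are assumptions made for this example, not necessarily what Perun uses internally.

```python
from datetime import datetime


def job_filename(binary, collector, workload, when):
    """Build a name following bin-collector-workload-timestamp.perf.

    Hypothetical helper: the timestamp layout below is an assumption.
    """
    timestamp = when.strftime("%Y-%m-%d-%H-%M-%S")
    return f"{binary}-{collector}-{workload}-{timestamp}.perf"


name = job_filename("mybin", "memory", "input.txt", datetime(2017, 3, 20, 10, 30, 0))
print(name)
```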

Refer to Command Line Interface, Automating Runs, Collectors Overview and Postprocessors Overview for more details about running command line commands, generating batches of jobs, and the capabilities of collectors and postprocessing techniques, respectively. The internals of Perun storage are described in Perun Internals.

Note that the interface of show allows one to use index and pending tags of the form i@i and i@p respectively, which serve as a quality-of-life feature for easily specifying the profiles to visualize.
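To make the tag shape concrete, the sketch below splits a tag such as 0@i or 2@p into its position and kind. It assumes that the part before @ is a non-negative integer; the names TAG_PATTERN and parse_tag are hypothetical and this is not Perun's own parser.

```python
import re

# Assumed tag shape: <non-negative integer>@<i or p>, e.g. 0@i or 2@p.
TAG_PATTERN = re.compile(r"(\d+)@([ip])")


def parse_tag(tag):
    """Split a tag into (position, kind); return None for anything else."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        return None
    kind = "index" if match.group(2) == "i" else "pending"
    return int(match.group(1)), kind


print(parse_tag("0@i"))
print(parse_tag("2@p"))
```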

_images/architecture-views.svg

Supported Visualizations

Perun’s tool suite currently contains the following visualizations:

  1. Bars Plot visualizes the data as bars, with moderate customization possibilities. The output is generated as an interactive HTML file using the Bokeh library, where one can e.g. move or resize the graph. Bars supports a high number of profile types.
  2. Flow Plot visualizes the data as flow (i.e. a classical continuous graph), with moderate customization possibilities. The output is generated as an interactive HTML file using the Bokeh library, where one can move and resize the graph. Flow supports a high number of profile types.
  3. Flame Graph is an interface to the Perl script of Brendan Gregg; it converts the profile (currently limited to memory profiles) to an internal format and visualizes the resources as stacks of proportional resource consumption depending on the trace of the resources.
  4. Scatter Plot visualizes the data as points on a two-dimensional grid, with moderate customization possibilities. This visualization also displays regression models if the input profile was postprocessed by Regression Analysis.
  5. Heap Map visualizes the memory consumption as a heap map of allocation resources to target memory addresses. Note that the output depends on the ncurses library and hence can currently be used only from UNIX terminals.

All of the listed visualizations can be run from the command line. For more information about the command line interface of an individual visualization, refer either to Collect units or to the corresponding subsection of this chapter.

For a brief tutorial on how to create your own visualization module and register it in Perun for further usage, refer to Creating your own Visualization. The format and the output are of your choice; the module only has to be built over the format described in Specification of Profile Format (or over one of the conversions, see Profile Conversions API).

Bars Plot

Bar graphs display resources as bars, with moderate customization possibilities (regarding the sources for axes, or grouping keys). The output backends of Bars are Bokeh and ncurses (the latter with limited possibilities). Bokeh graphs support either the stacked format (bars of different groups will be stacked on top of each other) or the grouped format (bars of different groups will be displayed next to each other).

Overview and Command Line Interface

perun show bars

Customizable interpretation of resources using the bar format.

  • Limitations: none.
  • Interpretation style: graphical
  • Visualization backend: Bokeh

Bars graph shows the aggregation (e.g. sum, count, etc.) of resources of given types (or keys). Each bar shows <func> of resources from <of> key (e.g. sum of amounts, average of amounts, count of types, etc.) per each <per> key (e.g. per each snapshot, or per each type). Moreover, the graphs can either be (i) stacked, where the different values of <by> key are shown above each other, or (ii) grouped, where the different values of <by> key are shown next to each other. Refer to resources for examples of keys that can be used as <of>, <key>, <per> or <by>.
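The roles of <func>, <of>, <per> and <by> can be illustrated with a few lines of plain Python. The resources below are made up, and the aggregate helper and the FUNCS table are hypothetical names introduced only for this sketch; it does not reproduce how the bars module is implemented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical resources; only the keys used in the text are modelled.
resources = [
    {"amount": 4, "subtype": "malloc", "uid": "main"},
    {"amount": 6, "subtype": "malloc", "uid": "load"},
    {"amount": 2, "subtype": "calloc", "uid": "main"},
    {"amount": 8, "subtype": "malloc", "uid": "main"},
]

# Assumed set of aggregation functions, for illustration only.
FUNCS = {"sum": sum, "count": len, "mean": mean}


def aggregate(resources, func, of, per, by):
    """Return {per_value: {by_value: func(of values)}}."""
    groups = defaultdict(lambda: defaultdict(list))
    for resource in resources:
        groups[resource[per]][resource[by]].append(resource[of])
    return {
        per_value: {by_value: FUNCS[func](values) for by_value, values in by_groups.items()}
        for per_value, by_groups in groups.items()
    }


# Stacked: one bar per <per> value, split into segments per <by> value.
# Grouped: the same numbers, drawn as neighbouring bars instead.
bars = aggregate(resources, "sum", of="amount", per="subtype", by="uid")
print(bars)
```

Both display modes are fed by the same aggregated numbers; only the placement of the bars for the individual <by> values differs.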

The Bokeh library is the current interpretation backend; it generates HTML files that can be opened directly in the browser. Resulting graphs can be further customized by adding custom labels for axes, a custom graph title or a different graph width.

Example 1. The following will display the sum of amounts of all resources for each subtype, stacked by uid (e.g. the locations in the program):

perun show 0@i bars sum --of 'amount' --per 'subtype' --stacked --by 'uid'

The example output of the bars is as follows:

                                <graph_title>
                        `
                        -         .::.                ````````
                        `         :&&:                ` # \  `
                        -   .::.  ::::        .::.    ` @  }->  <by>
                        `   :##:  :##:        :&&:    ` & /  `
        <func>(<of>)    -   :##:  :##:  .::.  :&&:    ````````
                        `   ::::  :##:  :&&:  ::::
                        -   :@@:  ::::  ::::  :##:
                        `   :@@:  :@@:  :##:  :##:
                        +````||````||````||````||````

                                    <per>

Refer to Bars Plot for more thorough description and example of bars interpretation possibilities.

perun show bars [OPTIONS] <aggregation_function>

Options

-o, --of <of_key>

Sets key that is source of the data for the bars, i.e. what will be displayed on Y axis. [required]

-p, --per <per_key>

Sets key that is source of values displayed on X axis of the bar graph.

-b, --by <by_key>

Sets the key that will be used either for stacking or grouping of values

-s, --stacked

Will stack the values by <resource_key> specified by option --by.

-g, --grouped

Will group the values by <resource_key> specified by option --by.

-f, --filename <filename>

Sets the output file for the graph.

-xl, --x-axis-label <x_axis_label>

Sets the custom label on the X axis of the bar graph.

-yl, --y-axis-label <y_axis_label>

Sets the custom label on the Y axis of the bar graph.

-gt, --graph-title <graph_title>

Sets the custom title of the bars graph.

-v, --view-in-browser

The generated graph will be immediately opened in the browser (firefox will be used).

Arguments

<aggregation_function>

Optional argument

Examples of Output

_images/complexity-bars.png

The Bars Plot above shows the overall sum of the running times for each structure-unit-size for the SLList_search function collected by Trace Collector. The interpretation highlights that most of the consumed running time was spent over singly linked lists with 41 elements.

_images/memory-bars-stacked.png

The bars above show the stacked view of the number of memory allocations made per each snapshot (with sampling of 1 second). Each bar shows the overall number of memory operations, as well as the proportional representation of different types of memory (de)allocation. It can also be seen that free is called approximately as many times as the allocation functions, which suggests that everything was probably freed.

_images/memory-bars-sum-grouped.png

The bars above show the grouped view of the sum of memory allocations of the same type per each snapshot (with sampling of 0.5 seconds). Grouped bars allow fast comparison of total amounts between different types. E.g. malloc seems to have allocated the most memory per each snapshot.

Flame Graph

Flame graph shows the relative consumption of resources w.r.t. the trace of the resource origin. Currently it is limited to memory profiles (however, generalization of the module is planned). Flame graphs are used for faster localization of resource consumption hot spots and bottlenecks.

Overview and Command Line Interface

perun show flamegraph

Flame graph interprets the relative and inclusive presence of the resources according to the stack depth of the origin of resources.

  • Limitations: memory profiles generated by Memory Collector.
  • Interpretation style: graphical
  • Visualization backend: HTML

Flame graph is intended to quickly identify hotspots that are the source of the resource consumption complexity. The X axis depicts the relative consumption of the data, while the Y axis displays the stack depth. The wider a bar is on the X axis, the more resources the function consumed relative to others.

Acknowledgements: Big thanks to Brendan Gregg for creating the original perl script for creating flame graphs w.r.t simple format. If you like this visualization technique, please check out this guy’s site (http://brendangregg.com) for more information about performance, profiling and useful talks and visualization techniques!

The example output of the flamegraph is more or less as follows:

                    `
                    -                         .
                    `                         |
                    -              ..         |     .
                    `              ||         |     |
                    -              ||        ||    ||
                    `            |%%|       |--|  |!|
                    -     |## g() ##|     |#g()#|***|
                    ` |&&&& f() &&&&|===== h() =====|
                    +````||````||````||````||````||````

Refer to Flame Graph for a more thorough description and examples of the interpretation technique. Refer to perun.profile.convert.to_flame_graph_format() for more details on how the profiles are converted to the flame graph format.
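For orientation, Brendan Gregg's flame graph script reads "folded" stacks: one line per unique stack, with frames joined by semicolons from root to leaf and followed by a number. The sketch below produces such lines from made-up resources. The resource layout (a trace list and an amount) and the helper name to_folded are assumptions for this example; it is not the code of perun.profile.convert.to_flame_graph_format().

```python
from collections import Counter

# Hypothetical memory resources; "trace" lists functions from root to leaf.
resources = [
    {"amount": 16, "trace": ["main", "f", "g"]},
    {"amount": 8, "trace": ["main", "f"]},
    {"amount": 16, "trace": ["main", "f", "g"]},
    {"amount": 4, "trace": ["main", "h"]},
]


def to_folded(resources):
    """Sum amounts per unique stack and emit 'a;b;c amount' lines."""
    totals = Counter()
    for resource in resources:
        totals[";".join(resource["trace"])] += resource["amount"]
    return [f"{stack} {amount}" for stack, amount in sorted(totals.items())]


folded = to_folded(resources)
print("\n".join(folded))
```

The width of a frame in the final picture then corresponds to the summed amounts of all stacks passing through it.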

perun show flamegraph [OPTIONS]

Options

-f, --filename <filename>

Sets the output file of the resulting flame graph.

-h, --graph-height <graph_height>

Increases the height of the resulting flame graph.

Examples of Output

_images/memory-flamegraph.png

The Flame Graph is an efficient visualization of inclusive consumption of resources. The width of the base of one flame shows the bottleneck and hotspots of profiled binaries.

Flow Plot

Flow graphs display resources as classic plots, with moderate customization possibilities (regarding the sources for axes, or grouping keys). The output backends of Flow are Bokeh and ncurses (the latter with limited possibilities). Bokeh graphs support either the classic display of resources (graphs will overlap) or the stacked format (graphs of different groups will be stacked on top of each other).

Overview and Command Line Interface

perun show flow

Customizable interpretation of resources using the flow format.

  • Limitations: none.
  • Interpretation style: graphical, textual
  • Visualization backend: Bokeh, ncurses

Flow graph shows the values of resources depending on the independent variable as a basic graph. For each group of resources identified by a unique value of the <by> key, one graph shows the dependency of <of> values aggregated by <func> on the <through> key. Moreover, the values can either be accumulated (this way, when displaying the value of ‘n’ on the X axis, we accumulate the sum of all values for all m < n) or stacked, where the graphs are drawn on top of each other, so one can see the overall trend through all the groups and the proportions between the groups.
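The difference between the accumulated and the stacked mode can be sketched on made-up numbers. The values below are hypothetical, already aggregated per <through> point, and the running total is taken inclusively; whether the point itself is included in the accumulation is an assumption of this sketch.

```python
from itertools import accumulate

# Hypothetical aggregated values per <through> point for two <by> groups.
through = [1, 2, 3, 4]
values = {"f": [2, 1, 3, 2], "g": [1, 1, 1, 4]}

# Accumulated: each group keeps its own line, drawn as a running total.
accumulated = {uid: list(accumulate(ys)) for uid, ys in values.items()}

# Stacked: each group's line sits on top of the groups drawn before it,
# so the topmost line shows the sum over all groups.
stacked = {}
baseline = [0] * len(through)
for uid, ys in values.items():
    baseline = [low + y for low, y in zip(baseline, ys)]
    stacked[uid] = baseline

print(accumulated)
print(stacked)
```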

The Bokeh library is the current interpretation backend; it generates HTML files that can be opened directly in the browser. Resulting graphs can be further customized by adding custom labels for axes, a custom graph title or a different graph width.

Example 1. The following will show the average amount (in this case the function running time) of each function depending on the size of the structure over which the given function operated:

perun show 0@i flow mean --of 'amount' --through 'structure-unit-size' \
    --accumulate --by 'uid'

The example output of the flow is as follows:

                                <graph_title>
                        `
                        -                      ______    ````````
                        `                _____/          ` # \  `
                        -               /          __    ` @  }->  <by>
                        `          ____/      ____/      ` & /  `
        <func>(<of>)    -      ___/       ___/           ````````
                        `  ___/    ______/       ____
                        -/  ______/        _____/
                        `__/______________/
                        +````||````||````||````||````

                                  <through>

Refer to Flow Plot for more thorough description and example of flow interpretation possibilities.

perun show flow [OPTIONS] <aggregation_function>

Options

-o, --of <of_key>

Sets key that is source of the data for the flow, i.e. what will be displayed on Y axis, e.g. the amount of resources. [required]

-t, --through <through_key>

Sets key that is source of the data value, i.e. the independent variable, such as snapshots or the size of the structure.

-b, --by <by_key>

For each <by_resource_key> one graph will be output, e.g. for each subtype or for each location of resource. [required]

-s, --stacked

Will stack the Y axis values for different <by> keys on top of each other. Additionally shows the sum of the values.

--accumulate, --no-accumulate

Will accumulate the values for all previous values of X axis.

-ut, --use-terminal

Shows flow graph in the terminal using ncurses library.

-f, --filename <filename>

Sets the output file for the graph.

-xl, --x-axis-label <x_axis_label>

Sets the custom label on the X axis of the flow graph.

-yl, --y-axis-label <y_axis_label>

Sets the custom label on the Y axis of the flow graph.

-gt, --graph-title <graph_title>

Sets the custom title of the flow graph.

-v, --view-in-browser

The generated graph will be immediately opened in the browser (firefox will be used).

Arguments

<aggregation_function>

Optional argument

Examples of Output

_images/memory-flow.png

The Flow Plot above shows the mean of allocated amounts per each allocation site (i.e. uid) in stacked mode. The stacking of the means clearly shows where the biggest allocations were made during the program run.

_images/complexity-flow.png

The Flow Plot above shows the trend of the average running time of the SLList_search function depending on the size of the structure we execute the search on.

Heap Map

Heap map is a visualization of underlying memory address map that links chunks of allocated memory to corresponding allocation sources. This is mostly for showing utilization of memory, where objects were allocated, how often, and how the objects are fragmented in the memory. Heap map visualization is interactive and is implemented using ncurses library.

Overview and Command Line Interface

perun show heapmap

Shows an interactive map of memory allocations to concrete memory addresses for each function.

  • Limitations: memory profiles generated by Memory Collector.
  • Interpretation style: textual
  • Visualization backend: ncurses

Heap map shows the underlying memory map, and links the concrete allocations to allocated addresses for each snapshot. The map is interactive; one can either play the full animation of the allocations through snapshots or move and explore the details of the map.

Moreover, the heap map contains a heat map mode, which accumulates the allocations into a heat representation: the hotter the colour displayed at a given memory cell, the more times memory was allocated there.

The heap map aims at showing the fragmentation of the memory and possible differences between different allocation strategies. On the other hand, the heat mode aims at showing the bottlenecks of allocations.
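The heat mode described above amounts to counting, for every address cell, how many allocations covered it across all snapshots. The following sketch does this for made-up data; the snapshot layout, the address and amount keys, and the helper name heat are assumptions for this example, not the heapmap module's internals.

```python
from collections import Counter

# Hypothetical snapshots: each allocation covers [address, address + amount).
snapshots = [
    [{"address": 0, "amount": 4}, {"address": 8, "amount": 2}],
    [{"address": 2, "amount": 4}],
    [{"address": 0, "amount": 3}],
]


def heat(snapshots):
    """Count, per address cell, how many allocations covered it."""
    cells = Counter()
    for snapshot in snapshots:
        for alloc in snapshot:
            for cell in range(alloc["address"], alloc["address"] + alloc["amount"]):
                cells[cell] += 1
    return cells


heat_map = heat(snapshots)
print(sorted(heat_map.items()))
```

Cells with the highest counts would be drawn in the warmest colours.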

Refer to Heap Map for more thorough description and example of heapmap interpretation possibilities.

perun show heapmap [OPTIONS]

Examples of Output

_images/memory-heapmap.png

The Heap Map shows the address space through time (snapshots) and visualizes the fragmentation of memory allocation per each allocation site. The heap map above shows the difference between allocations using lists (purple), skiplists (pinkish) and standard vectors (blue). The map itself is interactive and displays details about individual address cells.

_images/memory-heatmap.png

Heat map is a mode of heap map, which aggregates the allocations over all of the snapshots and uses warmer colours for address cells, where more allocations were performed.

Scatter Plot

Scatter plot visualizes the data as points on a two-dimensional grid, with moderate customization possibilities. This visualization also displays regression models if the input profile was postprocessed by Regression Analysis. The output backend of Scatter plot is the Bokeh library.

Overview and Command Line Interface

perun show scatter

Interactive visualization of resources and models in scatter plot format.

Scatter plot shows resources as points according to the given parameters. The plot interprets <per> and <of> as x, y coordinates for the points. The scatter plot also displays models located in the profile as curves/lines.
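As a rough illustration of this mapping, the sketch below turns made-up resources into (x, y) points and evaluates a linear model at the ends of the x range. The default keys follow the option defaults listed for perun show scatter; the model dictionary with b0 and b1 and the helper name to_points are hypothetical.

```python
# Hypothetical resources carrying the two keys used as coordinates.
resources = [
    {"structure-unit-size": 1, "amount": 3.0},
    {"structure-unit-size": 2, "amount": 5.1},
    {"structure-unit-size": 3, "amount": 6.9},
]

# Hypothetical linear model: y = b0 + b1 * x.
model = {"b0": 1.0, "b1": 2.0}


def to_points(resources, per="structure-unit-size", of="amount"):
    """Map each resource to an (x, y) pair taken from <per> and <of>."""
    return [(r[per], r[of]) for r in resources if per in r and of in r]


points = to_points(resources)
xs = [x for x, _ in points]
# A straight line needs only its two end points to be drawn.
line = [(x, model["b0"] + model["b1"] * x) for x in (min(xs), max(xs))]
print(points, line)
```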

  • Limitations: none.
  • Interpretation style: graphical
  • Visualization backend: Bokeh

Features in progress:

  • uid filters
  • models filters
  • multiple graphs interpretation

Graphs are displayed using the Bokeh library and can be further customized by adding custom labels for axes, a custom graph title and a different graph width.

The example output of the scatter is as follows:

                          <graph_title>
                  `                         o
                  -                        /
                  `                       /o       ```````````````````
                  -                     _/         `  o o = <points> `
                  `                   _- o         `    _             `
    <of>          -               __--o            `  _-  = <models> `
                  `    _______--o- o               `                 `
                  -    o  o  o                     ```````````````````
                  `
                  +````||````||````||````||````

                              <per>

Refer to Scatter Plot for more thorough description and example of scatter interpretation possibilities. For more thorough explanation of regression analysis and models refer to Regression Analysis.

perun show scatter [OPTIONS]

Options

-o, --of <of_key>

Data source for the scatter plot, i.e. what will be displayed on Y axis. [default: amount]

-p, --per <per_key>

Keys that will be displayed on X axis of the scatter plot. [default: structure-unit-size]

-f, --filename <filename>

Outputs the graph to the file specified by filename.

-xl, --x-axis-label <x_axis_label>

Label on the X axis of the scatter plot.

-yl, --y-axis-label <y_axis_label>

Label on the Y axis of the scatter plot.

-gt, --graph-title <graph_title>

Title of the scatter plot.

-v, --view-in-browser

Will show the graph in browser.

Examples of Output

_images/complexity-scatter-with-models-full.png

The Scatter Plot above shows the interpreted models of a different complexity example, computed using the full computation method. In the picture, one can see that the dependency of running time on the structural size is best fitted by linear models.

_images/complexity-scatter-with-models-initial-guess.png

The next scatter plot displays the same data as the previous one, but regressed using the initial guess strategy. This strategy first computes all models on a small sample of data points (the initial sample is selected at random). Such computation yields an initial estimate of the fitness of the models. The best fitted model is then chosen and fully computed on the rest of the data points.

The picture shows only one model, namely the linear one, which was fully computed to best fit the given data points. The rest of the models had worse estimates and hence were not computed at all.
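The initial guess strategy can be sketched in a few lines. This is a simplified illustration under stated assumptions: only two candidate models are tried, fitness is measured by the coefficient of determination, and the winning model is refitted on all points rather than only on the remaining ones. The names fit, MODELS and initial_guess are hypothetical and do not come from Regression Analysis.

```python
import random


def fit(points, transform):
    """Least-squares fit of y = b0 + b1 * transform(x); returns (b0, b1, r2)."""
    xs = [transform(x) for x, _ in points]
    ys = [y for _, y in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return b0, b1, 1 - ss_res / ss_tot


# Assumed candidate models, each expressed as a transformation of x.
MODELS = {"linear": lambda x: x, "quadratic": lambda x: x * x}


def initial_guess(points, sample_size, rng):
    """Estimate all models on a random sample, then fully fit only the best."""
    sample = rng.sample(points, sample_size)
    best = max(MODELS, key=lambda name: fit(sample, MODELS[name])[2])
    return best, fit(points, MODELS[best])


# Made-up, exactly linear data: running time 3 * size + 2.
points = [(x, 3 * x + 2) for x in range(1, 51)]
name, (b0, b1, r2) = initial_guess(points, sample_size=10, rng=random.Random(0))
print(name, round(b0, 6), round(b1, 6))
```

The saving comes from fitting the losing models only on the small sample instead of on the whole data set.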

Creating your own Visualization

New interpretation modules can be registered within Perun in several steps. Visualization methods have the fewest requirements: they only need to work over profiles conforming to Specification of Profile Format and implement a method for the Click API in order to be usable from the command line.

You can register your new visualization as follows:

  1. Run perun utils create view myview to generate a new module in the perun/view directory with the following structure. The command takes predefined templates for new visualization techniques and creates __init__.py and run.py according to the supplied command line arguments (see Utility Commands for more information about the interface of the perun utils create command):

    /perun
    |-- /view
        |-- /myview
            |-- __init__.py
            |-- run.py
        |-- /bars
        |-- /flamegraph
        |-- /flow
        |-- /heapmap
        |-- /scatter
    
  2. First, implement the __init__.py file, including the module docstring with a brief description of the visualization technique and the definitions of constants, which has the following structure:

"""..."""

SUPPORTED_PROFILES = ['mixed|memory|mixed']

__author__ = 'You!'
  3. Next, in run.py implement the command line interface function, named the same as your visualization technique. This function is called from the command line as perun show myview and is based on the Click library.
  4. Finally, register your newly created module in get_supported_module_names() located in perun.utils.__init__.py:
--- /mnt/f/phdwork/perun/gh-pages/docs/_static/templates/supported_module_names.py
+++ /mnt/f/phdwork/perun/gh-pages/docs/_static/templates/supported_module_names_views.py
@@ -8,5 +8,6 @@
         'vcs': ['git'],
         'collect': ['trace', 'memory', 'time'],
         'postprocess': ['filter', 'normalizer', 'regression-analysis'],
-        'view': ['alloclist', 'bars', 'flamegraph', 'flow', 'heapmap', 'raw', 'scatter']
+        'view': ['alloclist', 'bars', 'flamegraph', 'flow', 'heapmap', 'raw', 'scatter',
+                 'myview']
     }[package]
  5. Preferably, verify that registering did not break anything in Perun and, if you are not using the developer installation, reinstall Perun:

    make test
    make install
    
  6. At this point you can start using your visualization through perun show.

  7. If you think your visualization could help others, please consider making a Pull Request.