|
| 1 | +# 13. Flame graph support |
| 2 | + |
| 3 | +Date: 2025-06-12 |
| 4 | + |
| 5 | +## Status |
| 6 | + |
| 7 | +Accepted |
| 8 | + |
| 9 | +## Context |
| 10 | + |
| 11 | +This ADR outlines the design of flame graph support in the TSP and its implementation on the client and server side. Flame graphs exist in classic Trace Compass in Eclipse, where any call stack analysis provides a flame graph. The following image shows the view to be supported in the TSP. |
| 12 | + |
| 13 | + |
| 14 | + |
| 15 | +Eclipse Trace Compass's flame graph data provider for LTTng UST also shows Linux kernel states if the experiment contains both LTTng UST and kernel traces. This is domain specific, but it illustrates the potential of such a view. |
| 16 | + |
| 17 | + |
| 18 | + |
| 19 | +There are two possibilities to support the flame graph in the TSP. |
| 20 | + |
| 21 | +### Solution 1: New flame graph data provider type |
| 22 | + |
| 23 | +The view itself is very similar to data providers of type `TIME_GRAPH`. The main difference is that the x-axis is the duration and not time. The Trace Compass back-end currently has such a data provider available with the same API as `TIME_GRAPH` data providers. However, this data provider is not exposed to the server through the `getDescriptors()` method of the corresponding data provider factory. It will need a new type to be added to the `ProviderType` enum. |
| 24 | + |
| 25 | +With this solution, similar endpoints can be defined, which has the advantage that the endpoints and data structures are already known. Back-end filtering and highlighting would be supported out of the box. Assuming the new `ProviderType` is called `gantt`, the following endpoints will be defined: |
| 26 | + |
| 27 | +``` |
| 28 | + POST /experiments/{expUUID}/outputs/{outputId}/styles |
| 29 | + Get the styles defined for this data provider |
| 30 | +
|
| 31 | + GET /experiments/{expUUID}/outputs/{outputId}/annotations |
| 32 | + Get list of annotation categories for this view |
| 33 | +
|
| 34 | + POST /experiments/{expUUID}/outputs/{outputId}/annotations |
| 35 | + Get the list of annotations for the given annotation categories for this view |
| 36 | +
|
| 37 | + POST /experiments/{expUUID}/outputs/gantt/{outputId}/tree |
| 38 | + Returns the gantt chart entry model |
| 39 | +
|
| 40 | + POST /experiments/{expUUID}/outputs/gantt/{outputId}/states |
| 41 | + Returns the states for the given entries |
| 42 | +
|
| 43 | + POST /experiments/{expUUID}/outputs/gantt/{outputId}/tooltip |
| 44 | + Get detailed information of a state, annotation or arrow |
| 45 | +``` |
| 46 | + |
| 47 | +**Note**: |
| 48 | + |
| 49 | +- This solution allows fetching data depending on the zoom level. However, since the flame graph data is limited in size (in comparison to regular time graphs), all the data can be fetched once and all the zooming can be done in the FE only. |
| 50 | +- The annotation endpoints and styles endpoint are independent of the chart type (`TIME_GRAPH`) and can be provided for any chart type. |
| 51 | +- The endpoints and data structures for the time-based gantt chart (providerType `TIME_GRAPH`) are designed around timestamps. For example, states have a start and end, annotations have a `start` and `duration`, and the query parameter `requested_time_range` is meant for timestamps. Those names should be more generic, i.e. `delta` instead of `duration`, `requested_range` instead of `requested_time_range`, and so on. With generic names, it won't be necessary to duplicate endpoints, data structures etc. for different x-axes. |
| 52 | +- To get a flame graph for a given time range of the trace, `requested_time_range` should be used for that time range and a new query parameter `requested_range` for the view range (see the bullet above). |
| 53 | +- Arrows won't make sense for the flame graph and don't need to be supported. |
| 54 | + |
| 55 | +#### Visualization |
| 56 | + |
| 57 | +To visualize the `gantt` data in the theia-trace-extension, a new React component should be implemented using the timeline-chart library (similar to the `TimegraphOutputComponent`). Alternatively, the new React component can implement its own rendering or use an existing flame graph library, for example [d3-flame-graph](https://www.npmjs.com/package/d3-flame-graph/). |
| 58 | + |
| 59 | +#### Example implementation |
| 60 | + |
| 61 | +The Trace Compass mainline already has a flame graph data provider and data provider factory: |
| 62 | + |
| 63 | +- [Flame Graph Data Provider Factory](https://github.com/eclipse-tracecompass/org.eclipse.tracecompass/blob/master/analysis/org.eclipse.tracecompass.analysis.profiling.core/src/org/eclipse/tracecompass/internal/analysis/profiling/core/flamegraph/FlameGraphDataProviderFactory.java) |
| 64 | + - It is missing an implementation of `getDescriptors()` |
| 65 | +- [Flame Graph Data Provider](https://github.com/eclipse-tracecompass/org.eclipse.tracecompass/blob/master/analysis/org.eclipse.tracecompass.analysis.profiling.core/src/org/eclipse/tracecompass/internal/analysis/profiling/core/flamegraph/FlameGraphDataProvider.java) |
| 66 | + - It uses `requested_time_range` for x-axis query parameter |
| 67 | + - It uses the query parameter `selection_range` to specify the time range over which to calculate the flame graph. This is for the selection range flame graph, which doesn't exist in the TSP. |
| 68 | + - It uses the query parameter `group_by` to specify the grouping strategy (see section [Custom Aggregation](#custom-aggregation)). |
| 69 | + - It uses the query parameter prefix `actions` for tooltip actions to construct an action (e.g. go to min); see section [Actions](#actions) for more details. |
| 70 | + |
| 71 | +### Solution 2: New weighted tree data provider type |
| 72 | + |
| 73 | +The data of the flame graph could be expressed as a weighted tree. In fact, the flame graph data provider above uses a weighted tree implementation to serve the API methods. In this solution, a new weighted tree data provider type should be exposed in the TSP. To achieve that, the `DATA_TREE` data provider type needs to be extended to describe which column contains the weight values. The flame graph can then be built by traversing the tree to fill the flame graph view. |
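
The traversal mentioned above could look roughly like the following TypeScript sketch. The `WeightedNode` and `FlameRect` types and the `layoutFlameGraph` function are hypothetical names used only for illustration; they are not part of the TSP or of any existing library. The sketch assumes each node's weight already includes the weight of its children.

```typescript
// Hypothetical shape of a weighted tree node; NOT the TSP data model.
interface WeightedNode {
    name: string;
    weight: number; // assumed to include the weight of all children
    children: WeightedNode[];
}

// One rectangle of the flame graph; x and width are expressed in weight units.
interface FlameRect {
    name: string;
    depth: number;
    x: number;
    width: number;
}

// Depth-first traversal: children are placed left to right inside their parent.
function layoutFlameGraph(root: WeightedNode): FlameRect[] {
    const rects: FlameRect[] = [];
    const visit = (node: WeightedNode, depth: number, x: number): void => {
        rects.push({ name: node.name, depth, x, width: node.weight });
        let childX = x;
        for (const child of node.children) {
            visit(child, depth + 1, childX);
            childX += child.weight;
        }
    };
    visit(root, 0, 0);
    return rects;
}

// Example data, made up for illustration.
const exampleTree: WeightedNode = {
    name: 'main',
    weight: 100,
    children: [
        { name: 'parse', weight: 30, children: [] },
        {
            name: 'render',
            weight: 50,
            children: [{ name: 'draw', weight: 20, children: [] }]
        }
    ]
};
const exampleRects = layoutFlameGraph(exampleTree);
```

Each resulting rectangle can then be drawn at row `depth`, starting at `x` with the given `width`, by whichever rendering library the front-end uses.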
| 74 | + |
| 75 | +Since data providers of type `DATA_TREE` don't need any styles, the colors for the flame graph derived from the weighted tree are not defined. So, to have colors consistent with the `Flame Chart`, a `fetchStyles` endpoint has to be added for the data tree. |
| 76 | + |
| 77 | +The existing data tree endpoints are not sufficient. New endpoints need to be added for tooltips etc. There is no support for back-end annotations or arrows. |
| 78 | + |
| 79 | +``` |
| 80 | + // Existing endpoints. |
| 81 | + POST /experiments/{expUUID}/outputs/data/{outputId}/tree |
| 82 | + Returns the data tree entry model |
| 83 | +
|
| 84 | + // Endpoint exists, however each data provider needs to support it if applicable |
| 85 | + POST /experiments/{expUUID}/outputs/{outputId}/styles |
| 86 | + Get the styles defined for this data provider |
| 87 | +
|
| 88 | + // New endpoints |
| 89 | + POST /experiments/{expUUID}/outputs/data/{outputId}/tooltip |
| 90 | + Get detailed information of a row (or cell). |
| 91 | +``` |
| 92 | + |
| 93 | +**Notes** |
| 94 | +- The `DATA_TREE` endpoint will fetch the whole data in one query. Since the flame graph data is limited, this won't be an issue. |
| 95 | +- The `DATA_TREE` endpoint has the time-based query parameter `requested_time_range` to fetch the flame graph for a specific time range. |
| 96 | +- Using the data tree endpoint, it will be easier and more intuitive to define custom views for the same data. Annotations are not supported, however the generic endpoint for annotations (see solution 1) could be re-used. Applicable FE visualizations can then show them where it makes sense. An arrows (links) endpoint would not really make sense when using a data tree. |
| 97 | +- To get the tooltip of a state, the FE state needs to have a pointer to the data tree entry (row) to be able to query the back-end. |
| 98 | + |
| 99 | +#### Visualization |
| 100 | + |
| 101 | +To visualize the weighted `DATA_TREE` data as a flame graph in the theia-trace-extension, a new React component should be implemented using the timeline-chart library (similar to the `TimegraphOutputComponent`). Alternatively, the new React component can implement its own rendering or use an existing flame graph library, for example [d3-flame-graph](https://www.npmjs.com/package/d3-flame-graph/). |
| 102 | + |
| 103 | +### Other considerations |
| 104 | + |
| 105 | +#### Custom Aggregation |
| 106 | +Flame graphs can have different schemes for aggregation of layers, e.g. per process ID, threads or full aggregation, which will change the tree and duration values of each segment. Changing the aggregation scheme needs to be part of the TSP protocol, i.e. the flame graph data provider needs to provide the available aggregation schemes and there needs to be a query parameter or path parameter to specify which one to apply in the back-end. |
| 107 | + |
| 108 | +The data provider descriptor capabilities can be augmented to indicate if a data tree can change the aggregation scheme. With this in place we can define how the possible groupings are provided by the server to the client, and how the client provides the aggregation scheme. |
| 109 | + |
| 110 | +Aggregation will have to be supported in either of the solutions above. |
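
To illustrate how the aggregation scheme changes the tree and the weight of each segment, here is a minimal TypeScript sketch. The `StackSample` input format, the `GroupBy` values and the `aggregate` function are assumptions made for this example only; they do not describe the Trace Compass implementation or the values accepted by `group_by`. Each sample's duration is treated as time spent at the innermost function of its stack and is added to every ancestor.

```typescript
// Hypothetical input: time spent in one call stack of one thread.
interface StackSample {
    pid: number;
    tid: number;
    stack: string[]; // outermost function first
    duration: number; // time attributed to the innermost function
}

interface AggNode {
    name: string;
    weight: number; // accumulated duration, including children
    children: Map<string, AggNode>;
}

// Hypothetical aggregation schemes for this sketch.
type GroupBy = 'process' | 'thread' | 'all';

// Extra tree levels inserted above the call stack, depending on the scheme.
function groupKeys(sample: StackSample, groupBy: GroupBy): string[] {
    switch (groupBy) {
        case 'process':
            return [`pid ${sample.pid}`];
        case 'thread':
            return [`pid ${sample.pid}`, `tid ${sample.tid}`];
        default:
            return [];
    }
}

function aggregate(samples: StackSample[], groupBy: GroupBy): AggNode {
    const root: AggNode = { name: 'root', weight: 0, children: new Map() };
    for (const sample of samples) {
        root.weight += sample.duration;
        let node = root;
        for (const name of groupKeys(sample, groupBy).concat(sample.stack)) {
            let child = node.children.get(name);
            if (!child) {
                child = { name, weight: 0, children: new Map() };
                node.children.set(name, child);
            }
            child.weight += sample.duration;
            node = child;
        }
    }
    return root;
}

// Example data, made up for illustration.
const samples: StackSample[] = [
    { pid: 1, tid: 10, stack: ['main', 'read'], duration: 5 },
    { pid: 1, tid: 11, stack: ['main', 'read'], duration: 7 },
    { pid: 2, tid: 20, stack: ['main'], duration: 3 }
];
const byAll = aggregate(samples, 'all');
const byThread = aggregate(samples, 'thread');
```

With full aggregation, all three samples merge under a single `main` node, whereas grouping by thread keeps one subtree per process and thread, so the same trace yields different trees and segment weights.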
| 111 | + |
| 112 | +#### Weight metric |
| 113 | + |
| 114 | +Please note that the weight metric of the flame graph and weighted tree in the description above is duration. However, the weight metric can be another metric, e.g. number of calls, bytes allocated, number of page faults. The solution should allow those use cases. To support that there needs to be a description of the metric used so that the visualization is adapted accordingly (e.g. show the correct units). |
| 115 | + |
| 116 | +#### Actions |
| 117 | + |
| 118 | +The Flame Graph in Trace Compass has built-in statistics that allow navigating to the time range of the minimum or maximum duration in the trace. For solution 1, the statistics are shown as part of the segment tooltip, which is queried on demand. The min/max time range has to be encoded in the tooltip, and the UI needs to be able to understand that this is an action. For that we need to define an action indicator (e.g. prefix #ACTION or similar). |
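
As a rough sketch of how a front-end might tell actions apart from plain tooltip entries, consider the TypeScript below. The encoding `#ACTION <start>,<end>` is purely an assumption made for this example; the actual action indicator and payload format still have to be defined. The names `TimeRangeAction`, `ParsedTooltip` and `splitTooltip` are likewise hypothetical.

```typescript
// ASSUMED encoding, not defined by the TSP: "#ACTION <start>,<end>".
const ACTION_PATTERN = /^#ACTION (\d+),(\d+)$/;

interface TimeRangeAction {
    label: string;
    // Kept as strings: nanosecond timestamps can exceed Number.MAX_SAFE_INTEGER.
    start: string;
    end: string;
}

interface ParsedTooltip {
    info: Record<string, string>; // plain key/value entries to display
    actions: TimeRangeAction[]; // entries the UI can turn into navigation
}

// Separates plain tooltip entries from entries carrying the action prefix.
function splitTooltip(tooltip: Record<string, string>): ParsedTooltip {
    const info: Record<string, string> = {};
    const actions: TimeRangeAction[] = [];
    for (const key of Object.keys(tooltip)) {
        const value = tooltip[key];
        const match = ACTION_PATTERN.exec(value);
        if (match) {
            actions.push({ label: key, start: match[1], end: match[2] });
        } else {
            info[key] = value;
        }
    }
    return { info, actions };
}

// Example tooltip, made up for illustration.
const exampleTooltip: Record<string, string> = {
    'Total duration': '12 ms',
    'Go to min': '#ACTION 100,150',
    'Go to max': '#ACTION 900,1400'
};
const parsedTooltip = splitTooltip(exampleTooltip);
```

The UI could render each entry in `actions` as a link or button that selects the given time range, while the entries in `info` are shown as regular tooltip rows.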
| 119 | + |
| 120 | +For the data tree solution, the statistics could be provided as a tooltip on the row as well. For that, a new tooltip endpoint for data trees is required. The action design would be the same as in solution 1. |
| 121 | + |
| 122 | +## Decision |
| 123 | + |
| 124 | +Flame Graph support will be added. Re-using the flame graph data provider (solution 1) will allow the client FE to leverage the existing gantt chart view implementation. It also provides a known API. |
| 125 | + |
| 126 | +Support for a weighted tree is also useful, with or without solution 1, and should be added in any case. Note that the Trace Compass trace server back-end doesn't already have such a data provider. Creating a flame graph from the weighted tree would put more processing on the FE client. However, configurations like the aggregation scheme would only need to be applied once. |
| 127 | + |
| 128 | +The decision is to implement solution 1 with a dedicated endpoint for flame graphs. The weighted tree support as part of the `DATA_TREE` endpoint will be implemented later. |
| 129 | + |
| 130 | +## Consequences |
| 131 | + |
| 132 | +Flame Graphs are important profiling views for analyzing performance of applications. Adding those will help developers find performance bottlenecks faster. |