Commit d30e00c

Add ADR about adding support for Flame Graph in TSP and trace viewer
Signed-off-by: Bernd Hufmann <bernd.hufmann@ericsson.com>
1 parent a2c9c8b commit d30e00c

3 files changed: +132 -0 lines changed

@@ -0,0 +1,132 @@
1+
# 13. Flame graph support
2+
3+
Date: 2025-06-12
4+
5+
## Status
6+
7+
Accepted
8+
9+
## Context
10+
11+
This ADR outlines the design of flame graph support in the Trace Server Protocol (TSP) and its implementation on the client and server side. Flame graphs exist in classic Trace Compass in Eclipse, where any call stack analysis provides one. The following image shows the view to be supported in the TSP.
12+
13+
![current](0013/flame-graph-0.0.1.png)
14+
15+
Eclipse Trace Compass's flame graph data provider for LTTng UST also shows Linux kernel states if the experiment contains both LTTng UST and kernel traces. This is domain specific, but it illustrates the potential of such a view.
16+
17+
![current](0013/flame-graph-kernel-0.0.1.png)
18+
19+
There are two possibilities to support the flame graph in the TSP.
20+
21+
### Solution 1: New flame graph data provider type
22+
23+
The view itself is very similar to data providers of type `TIME_GRAPH`. The main difference is that the x-axis is the duration and not time. The Trace Compass back-end currently has such a data provider available with the same API as `TIME_GRAPH` data providers. However, this data provider is not exposed to the server through the `getDescriptors()` method of the corresponding data provider factory. Exposing it requires adding a new type to the `ProviderType` enum.
24+
25+
With this solution, similar endpoints can be defined, which has the advantage that the endpoints and data structures are already known. Back-end filtering and highlighting would be supported out of the box. Assuming the new `ProviderType` value is called `gantt`, the following endpoints will be defined:
26+
27+
```
28+
POST /experiments/{expUUID}/outputs/{outputId}/styles
29+
Get the styles defined for this data provider
30+
31+
GET /experiments/{expUUID}/outputs/{outputId}/annotations
32+
Get list of annotation categories for this view
33+
34+
POST /experiments/{expUUID}/outputs/{outputId}/annotations
35+
Get the list of annotations for the given annotation categories for this view
36+
37+
POST /experiments/{expUUID}/outputs/gantt/{outputId}/tree
38+
Returns the gantt chart entry model
39+
40+
POST /experiments/{expUUID}/outputs/gantt/{outputId}/states
41+
Returns the states for the given entries
42+
43+
POST /experiments/{expUUID}/outputs/gantt/{outputId}/tooltip
44+
Get detailed information of a state, annotation or arrow
45+
```
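
The endpoint list above can be turned into a small path builder. This is a minimal sketch, assuming the new provider type is exposed as the path segment `gantt` as proposed. The function name `gantt_endpoint`, the `GANTT_RESOURCES` tuple and the example IDs are hypothetical and not part of any existing TSP client.

```python
from urllib.parse import quote

# Sub-resources proposed in this ADR for the hypothetical `gantt` provider type.
GANTT_RESOURCES = ("tree", "states", "tooltip")


def gantt_endpoint(exp_uuid: str, output_id: str, resource: str) -> str:
    """Build the relative path of one of the proposed gantt endpoints."""
    if resource not in GANTT_RESOURCES:
        raise ValueError(f"unknown gantt resource: {resource}")
    # Percent-encode the IDs so that characters such as '/' cannot alter the path.
    return (
        f"/experiments/{quote(exp_uuid, safe='')}"
        f"/outputs/gantt/{quote(output_id, safe='')}/{resource}"
    )
```

For example, `gantt_endpoint("abc", "my.flame.graph", "tree")` yields the path that would be used with `POST` to fetch the entry model.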
46+
47+
**Notes**:
48+
49+
- This solution allows fetching data depending on the zoom level. However, since the flame graph data is limited in size (in comparison to regular time graphs), all the data can be fetched once and all zooming can be done in the front-end (FE) only.
50+
- The annotation endpoints and the styles endpoint are independent of the chart type (e.g. `TIME_GRAPH`) and can be provided for any chart type.
51+
- The endpoints and data structures for the time-based gantt chart (providerType `TIME_GRAPH`) are designed around timestamps. For example, states have a start and end, annotations have a `start` and `duration`, and the query parameter `requested_time_range` is meant for timestamps. Those names should be more generic: `delta` instead of `duration`, `requested_range` instead of `requested_time_range`, and so on. With generic names it won't be necessary to duplicate endpoints, data structures etc. for different x-axes.
52+
- To get a flame graph for a given time range of the trace, `requested_time_range` should be used for the trace time range, and a new query parameter `requested_range` should be used for the view range (see bullet above).
53+
- Arrows won't make sense for a flame graph and don't need to be supported.
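
Because the whole flame graph can be fetched once, zooming reduces to filtering on the client. The sketch below assumes states carry generic `start` and `end` values on the x-axis (duration offsets here, not timestamps) and takes the visible range as a `requested_range` pair. The `State` type and the `visible_states` function are illustrative names only, not TSP data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    label: str
    start: int  # x-axis position; a duration offset for flame graphs
    end: int


def visible_states(states, requested_range):
    """Return the states overlapping the requested range, clipped to it."""
    range_start, range_end = requested_range
    result = []
    for s in states:
        if s.end <= range_start or s.start >= range_end:
            continue  # entirely outside the visible range
        result.append(
            State(s.label, max(s.start, range_start), min(s.end, range_end))
        )
    return result
```

Since nothing in the function refers to time, the same code would work for a time-based x-axis, which is the point of the generic names suggested above.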
54+
55+
#### Visualization
56+
57+
To visualize the `gantt` data in the theia-trace-extension, a new React component should be implemented using the timeline-chart library (similar to the `TimegraphOutputComponent`). Alternatively, the new React component can implement or use an existing flame graph library, for example [d3-flame-graph](https://www.npmjs.com/package/d3-flame-graph/).
58+
59+
#### Example implementation
60+
61+
The Trace Compass mainline already has a flame graph data provider and data provider factory:
62+
63+
- [Flame Graph Data Provider Factory](https://github.com/eclipse-tracecompass/org.eclipse.tracecompass/blob/master/analysis/org.eclipse.tracecompass.analysis.profiling.core/src/org/eclipse/tracecompass/internal/analysis/profiling/core/flamegraph/FlameGraphDataProviderFactory.java)
64+
  - It lacks an implementation of `getDescriptors()`
65+
- [Flame Graph Data Provider](https://github.com/eclipse-tracecompass/org.eclipse.tracecompass/blob/master/analysis/org.eclipse.tracecompass.analysis.profiling.core/src/org/eclipse/tracecompass/internal/analysis/profiling/core/flamegraph/FlameGraphDataProvider.java)
66+
  - It uses `requested_time_range` as the x-axis query parameter
67+
  - It uses the query parameter `selection_range` to specify the time range of the trace over which the flame graph is calculated. This supports the selection-range flame graph, which doesn't exist in the TSP.
68+
  - It uses the query parameter `group_by` to specify the grouping strategy (see chapter [Custom Aggregation](#custom-aggregation)).
69+
  - It uses the query parameter prefix `actions` for tooltip actions to construct an action (e.g. go to min); see chapter [Actions](#actions) for more details.
70+
71+
### Solution 2: New weighted tree data provider type
72+
73+
The data of the flame graph could be expressed as a weighted tree. In fact, the flame graph data provider above uses a weighted tree implementation to serve the API methods. In this solution, a new weighted tree data provider type should be exposed in the TSP. To achieve that, the `DATA_TREE` data provider type needs to be extended to describe which column contains the weight values. The flame graph can then be built by traversing the tree to fill the flame graph view.
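
The traversal mentioned above can be sketched as a pre-order walk that turns each tree node into a rectangle. This is a minimal illustration, assuming each node's weight is inclusive (it already contains the weight of its children). `WeightedNode` and `layout` are hypothetical names, not types from Trace Compass or the TSP.

```python
from dataclasses import dataclass, field


@dataclass
class WeightedNode:
    name: str
    weight: int  # inclusive weight, e.g. total duration including children
    children: list = field(default_factory=list)


def layout(node, depth=0, offset=0):
    """Yield (name, depth, start, end) rectangles in pre-order."""
    yield (node.name, depth, offset, offset + node.weight)
    child_offset = offset
    for child in node.children:
        # Children are placed side by side, starting at the parent's left edge.
        yield from layout(child, depth + 1, child_offset)
        child_offset += child.weight
```

Any width of a parent that is not covered by its children corresponds to the parent's self weight.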
74+
75+
Since data providers of type `DATA_TREE` don't need any styles, the colors for the flame graph derived from the weighted tree are not defined. To keep the colors consistent with the `Flame Chart`, a `fetchStyles` endpoint has to be added for the data tree.
76+
77+
The existing data tree endpoints are not sufficient. New endpoints need to be added for tooltips etc. There is no support for back-end provided annotations or arrows.
78+
79+
```
80+
// Existing endpoints.
81+
POST /experiments/{expUUID}/outputs/data/{outputId}/tree
82+
    Returns the data tree entry model
83+
84+
// Endpoint exists; however, each data provider needs to support it if applicable
85+
POST /experiments/{expUUID}/outputs/{outputId}/styles
86+
Get the styles defined for this data provider
87+
88+
// New endpoints
89+
POST /experiments/{expUUID}/outputs/data/{outputId}/tooltip
90+
Get detailed information of a row (or cell).
91+
```
92+
93+
**Notes**
94+
- The `DATA_TREE` endpoint will fetch the whole data in one query. Since the flame graph data is limited in size, this won't be an issue.
95+
- The `DATA_TREE` endpoint has the time-based query parameter `requested_time_range`, which can be used to fetch the flame graph for a specific time range.
96+
- Using the data tree endpoint, it will be easier and more intuitive to define custom views for the same data. Annotations are not supported; however, the generic endpoint for annotations (see solution 1) could be re-used. Applicable FE visualizations can then show them where it makes sense. An arrows (links) endpoint would not really make sense when using a data tree.
97+
- To get the tooltip of a state, the FE state needs to have a pointer to the data tree entry (row) to be able to query the back-end.
98+
99+
#### Visualization
100+
101+
To visualize the weighted `DATA_TREE` data as a flame graph in the theia-trace-extension, a new React component should be implemented using the timeline-chart library (similar to the `TimegraphOutputComponent`). Alternatively, the new React component can implement or use an existing flame graph library, for example [d3-flame-graph](https://www.npmjs.com/package/d3-flame-graph/).
102+
103+
### Other considerations
104+
105+
#### Custom Aggregation
106+
Flame graphs can have different schemes for aggregation of layers, e.g. per process ID, threads or full aggregation, which will change the tree and duration values of each segment. Changing the aggregation scheme needs to be part of the TSP protocol, i.e. the flame graph data provider needs to provide the available aggregation schemes and there needs to be a query parameter or path parameter to specify which one to apply in the back-end.
107+
108+
The data provider descriptor capabilities can be augmented to indicate if a data tree can change the aggregation scheme. With this in place we can define how the possible groupings are provided by the server to the client, and how the client provides the aggregation scheme.
109+
110+
Aggregation will have to be supported in either of the solutions above.
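
To illustrate how the aggregation scheme changes the tree and the weights, the sketch below merges call stack samples either per thread or fully. It assumes samples of the form `(tid, stack, duration)` and the scheme names `"thread"` and `"all"`. These names, the sample shape and the `aggregate` function are hypothetical and not the values used by the Trace Compass `group_by` parameter.

```python
def aggregate(samples, group_by):
    """Merge (tid, stack, duration) samples into a nested weighted tree.

    group_by is "thread" (one subtree per thread ID) or "all" (full aggregation).
    """
    if group_by not in ("thread", "all"):
        raise ValueError(f"unknown aggregation scheme: {group_by}")
    root = {"weight": 0, "children": {}}
    for tid, stack, duration in samples:
        # Per-thread grouping adds one extra layer above the call stack.
        prefix = (f"tid {tid}",) if group_by == "thread" else ()
        node = root
        node["weight"] += duration
        for frame in prefix + tuple(stack):
            node = node["children"].setdefault(
                frame, {"weight": 0, "children": {}}
            )
            node["weight"] += duration
    return root
```

With full aggregation, identical stacks from different threads collapse into one segment whose weight is the sum. With per-thread grouping they stay separate under their thread's layer.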
111+
112+
#### Weight metric
113+
114+
Please note that the weight metric of the flame graph and weighted tree in the description above is duration. However, the weight metric can be another metric, e.g. number of calls, bytes allocated, number of page faults. The solution should allow those use cases. To support that there needs to be a description of the metric used so that the visualization is adapted accordingly (e.g. show the correct units).
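
One possible shape for such a metric description is sketched below. The `WeightMetric` type, its fields and the `format_weight` helper are assumptions made for illustration; the ADR does not define how the metric is described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightMetric:
    name: str  # e.g. "Duration", "Number of calls", "Bytes allocated"
    unit: str  # e.g. "ns", "B", or "" for plain counts


def format_weight(value: int, metric: WeightMetric) -> str:
    """Render a weight with the unit announced by the metric description."""
    text = f"{value:,}"
    return f"{text} {metric.unit}" if metric.unit else text
```

A visualization could use the same description to label the x-axis and the tooltips without knowing the metric in advance.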
115+
116+
#### Actions
117+
118+
The Flame Graph in Trace Compass has built-in statistics that allow navigating to the time range of the minimum or maximum duration in the trace. For solution 1, the statistics are shown as part of the segment tooltip, which is queried on demand. The min/max time range has to be encoded in the tooltip, and the UI needs to be able to understand that this is an action. For that we need to define an action indicator (e.g. prefix `#ACTION` or similar).
119+
120+
For the data tree solution, the statistics could be provided as a tooltip on the row as well. For that, a new tooltip endpoint for data trees is required. The action design would be the same as in solution 1.
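
A client could separate actions from plain tooltip entries as sketched below. The encoding is an assumption made for illustration: keys starting with the marker `#ACTION` denote actions, and their values carry a time range as `"start,end"`. The ADR leaves the exact indicator and value format open, and `split_tooltip` is a hypothetical helper.

```python
ACTION_PREFIX = "#ACTION"  # hypothetical marker; the exact indicator is not decided


def split_tooltip(tooltip):
    """Separate plain tooltip entries from action entries.

    Action values are assumed to encode a time range as "start,end".
    """
    info, actions = {}, {}
    for key, value in tooltip.items():
        if key.startswith(ACTION_PREFIX):
            start, end = (int(part) for part in value.split(","))
            # Strip the marker so the UI can use the rest as the action label.
            actions[key[len(ACTION_PREFIX):].strip()] = (start, end)
        else:
            info[key] = value
    return info, actions
```

The UI would render `info` as regular tooltip text and turn each entry of `actions` into a control that navigates to the given time range.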
121+
122+
## Decision
123+
124+
Flame graph support will be added. Re-using the flame graph data provider (solution 1) will allow the FE client to leverage the existing gantt chart view implementation. It also provides a known API.
125+
126+
Implementing support for a weighted tree is also useful, with or without implementing solution 1. It should be added in any case. Note that the Trace Compass trace server back-end doesn't yet have such a data provider. Creating a flame graph from the weighted tree would shift more processing to the FE client. However, configurations like the aggregation scheme would only need to be done once.
127+
128+
The decision is to implement solution 1, with a dedicated endpoint for flame graphs. The weighted tree support as part of the `DATA_TREE` endpoint will be implemented later.
129+
130+
## Consequences
131+
132+
Flame Graphs are important profiling views for analyzing performance of applications. Adding those will help developers find performance bottlenecks faster.

doc/adr/0013/flame-graph-0.0.1.png

84.3 KB
84.2 KB