3 files changed +54 -4 lines changed
projects/rocprofiler-compute
src/rocprof_compute_tui/utils
@@ -47,6 +47,50 @@ Full documentation for ROCm Compute Profiler is available at [https://rocm.docs.
 * sL1D-L2 BW Utilization (section 1401)
 * Bandwidth Utilization (section 1601)

+* Update `System Speed-of-Light` panel to `GPU Speed-of-Light` in TUI with the following metrics:
+  * Theoretical LDS Bandwidth
+  * vL1D Cache BW
+  * L2 Cache BW
+  * L2-Fabric Read BW
+  * L2-Fabric Write BW
+  * Kernel Time
+  * Kernel Time (Cycles)
+  * SIMD Utilization
+  * Clock Rate
+
+* Add `Compute Throughput` panel to TUI with the following metrics:
+  * VALU FLOPs
+  * VALU IOPs
+  * MFMA FLOPs (F8)
+  * MFMA FLOPs (BF16)
+  * MFMA FLOPs (F16)
+  * MFMA FLOPs (F32)
+  * MFMA FLOPs (F64)
+  * MFMA FLOPs (F6F4) (in gfx950)
+  * MFMA IOPs (Int8)
+  * SALU Utilization
+  * VALU Utilization
+  * MFMA Utilization
+  * VMEM Utilization
+  * Branch Utilization
+  * IPC
+
+* Add `Memory Throughput` panel to TUI with the following metrics:
+  * vL1D Cache BW
+  * vL1D Cache Utilization
+  * Theoretical LDS Bandwidth
+  * LDS Utilization
+  * L2 Cache BW
+  * L2 Cache Utilization
+  * L2-Fabric Read BW
+  * L2-Fabric Write BW
+  * sL1D Cache BW
+  * L1I BW
+  * Address Processing Unit Busy
+  * Data-Return Busy
+  * L1I-L2 Bandwidth
+  * sL1D-L2 BW
+
 ### Resolved issues

 * Fixed not detecting memory clock issue when using amd-smi
 # NOTE: This is used as a TUI-only yaml file for the beta release of the new performance metric organization
 Panel Config:
   id: 3200
-  title: System Speed-of-Light
+  title: GPU Speed-of-Light
   metrics_description:
     Theoretical LDS Bandwidth: Indicates the maximum amount of bytes that could have
       been loaded from, stored to, or atomically updated in the LDS per unit time
@@ -37,7 +37,7 @@ Panel Config:
   data source:
     - metric_table:
         id: 3201
-        title: System Speed-of-Light
+        title: GPU Speed-of-Light
         header:
           metric: Metric
           value: Avg
@@ -7,8 +7,14 @@ sections:
     collapsed: true
     class: "sysinfo-section"
     subsections:
-      - title: "System Speed-of-Light"
-        data_path: ["2. System Speed-of-Light", "2.1 System Speed-of-Light"]
+      - title: "GPU Speed-of-Light"
+        data_path: ["32. GPU Speed-of-Light", "32.1 GPU Speed-of-Light"]
+        collapsed: true
+      - title: "Compute Throughput"
+        data_path: ["33. Compute Throughput", "33.1 Compute Throughput"]
+        collapsed: true
+      - title: "Memory Throughput"
+        data_path: ["34. Memory Throughput", "34.1 Memory Throughput"]
         collapsed: true
       - title: "Memory Chart"
         data_path: ["3. Memory Chart", "3.1 Memory Chart"]
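
Reviewer note: the `data_path` entries in this diff appear to line up with the panel and table ids in the panel config (table id 3201 with "32." and "32.1"). The sketch below illustrates that apparent convention and how a two-level `data_path` could be followed into nested results. It is an assumption inferred from the values visible in this diff, not the TUI's actual implementation; `path_labels`, `resolve`, and the `results` dictionary are hypothetical names introduced only for illustration.

```python
from __future__ import annotations

from typing import Any


def path_labels(table_id: int, title: str) -> list[str]:
    """Build the two data_path labels from a table id and a title.

    Assumed convention (inferred from the diff, not confirmed):
    table id 3201 -> panel number 32, table number 1.
    """
    panel_no, table_no = divmod(table_id, 100)
    return [f"{panel_no}. {title}", f"{panel_no}.{table_no} {title}"]


def resolve(data: dict[str, Any], data_path: list[str]) -> Any:
    """Follow each key of data_path through a nested dictionary."""
    node: Any = data
    for key in data_path:
        node = node[key]  # raises KeyError if a label does not match
    return node


# Hypothetical stand-in for the analysis results a panel would display.
results = {
    "32. GPU Speed-of-Light": {
        "32.1 GPU Speed-of-Light": ["row placeholder"],
    },
}

labels = path_labels(3201, "GPU Speed-of-Light")
table = resolve(results, labels)
```

If the convention holds, a mismatch between the panel config title and the `data_path` labels would surface as a `KeyError` in a lookup like this, which is one reason the title rename and the `data_path` update belong in the same change.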