Commit 8d709bc

Authored by Baraldi, Giovanni
SWDEV-513725: Update readme for gfx11+ power states (#193)
* Update readme

* Update README.md

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Address review comments

* Update README.md

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: 831e469]
1 parent 81250fa commit 8d709bc

1 file changed: +23 -7 lines changed

projects/rocprofiler-sdk/README.md

Lines changed: 23 additions & 7 deletions
@@ -83,16 +83,32 @@ Please report in the Github Issues.

 - Using PC sampling on multi-threaded applications might fail with `HSA_STATUS_ERROR_EXCEPTION`.Furthermore, if three or more threads launch operations to the same agent, and if PC sampling is enabled, the `HSA_STATUS_ERROR_EXCEPTION` might appear.

-- Navi3x requires a stable power state for counter collection.
-Currently, this state needs to be set by the user.
-To do so, set "power_dpm_force_performance_level" to be writeable for non-root users, then set performance level to profile_standard:
-
+- gfx11 and gfx12 requires a stable power state for counter collection. This includes Radeon 7000 GPUs.
 ```bash
-sudo chmod 777 /sys/class/drm/card0/device/power_dpm_force_performance_level
-echo profile_standard >> /sys/class/drm/card0/device/power_dpm_force_performance_level
+# For device <N>. Use 'rocm-smi' or 'amd-smi monitor' to see device number.
+sudo amd-smi set -g <N> -l stable_std
+# After profiling, set power state back to 'auto'
+sudo amd-smi set -g <N> -l auto
 ```

-Recommended: "profile_standard" for counter collection and "auto" for all other profiling. Use rocm-smi to verify the current power state. For multiGPU systems (includes integrated graphics), replace "card0" by the desired card.
+The gfx version can be found via `amd-smi static --asic -g <N>` in the `TARGET_GRAPHICS_VERSION` field:
+
+```bash
+$ amd-smi static -a -g 2
+GPU: 2
+ASIC:
+MARKET_NAME: Navi 33 [Radeon Pro W7500]
+VENDOR_ID: 0x1002
+VENDOR_NAME: Advanced Micro Devices Inc. [AMD/ATI]
+SUBVENDOR_ID: 0x1002
+DEVICE_ID: 0x7489
+SUBSYSTEM_ID: 0x0e0d
+REV_ID: 0x00
+ASIC_SERIAL: N/A
+OAM_ID: N/A
+NUM_COMPUTE_UNITS: 28
+TARGET_GRAPHICS_VERSION: gfx1102
+```

 > [!WARNING]
 > The latest mainline version of AQLprofile can be found at [https://repo.radeon.com/rocm/misc/aqlprofile/](https://repo.radeon.com/rocm/misc/aqlprofile/). However, it's important to note that updates to the public AQLProfile may not occur as frequently as updates to the rocprofiler-sdk. This discrepancy could lead to a potential mismatch between the AQLprofile binary and the rocprofiler-sdk source.
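
The workflow this change documents (look up the gfx version, set `stable_std`, collect counters, restore `auto`) could be wrapped in a small shell helper. The following is only a sketch, not part of the commit: the function names `gfx_version`, `needs_stable_power_state`, and `with_stable_power_state` are hypothetical, and the only `amd-smi` invocations used are the ones shown in the README text. It assumes that any `gfx11*` or `gfx12*` target needs the stable power state, as the README line states.

```bash
#!/usr/bin/env bash
# Sketch only: helper names below are hypothetical, not part of rocprofiler-sdk.

# Read `amd-smi static --asic -g <N>` output on stdin and print the value of
# the TARGET_GRAPHICS_VERSION field (for example "gfx1102").
gfx_version() {
    awk -F': *' '/TARGET_GRAPHICS_VERSION/ {
        gsub(/[ \t\r]+$/, "", $2)   # drop trailing whitespace, if any
        print $2
        exit
    }'
}

# Succeed (exit status 0) when the given gfx version is in the gfx11 or gfx12
# family. Assumption: every gfx11*/gfx12* target needs the stable power state.
needs_stable_power_state() {
    case "$1" in
        gfx11*|gfx12*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Run a command against GPU <N>; on gfx11/gfx12 set 'stable_std' first and
# restore 'auto' afterwards, even when the command itself fails.
with_stable_power_state() {
    local gpu="$1"; shift
    local gfx status=0
    gfx="$(amd-smi static --asic -g "$gpu" | gfx_version)"
    if needs_stable_power_state "$gfx"; then
        sudo amd-smi set -g "$gpu" -l stable_std
        "$@" || status=$?
        sudo amd-smi set -g "$gpu" -l auto
        return "$status"
    fi
    "$@"
}
```

After sourcing the file, a call such as `with_stable_power_state 2 ./my_counter_collection.sh` (where the script name is a placeholder for whatever profiling command is in use) would leave the GPU in `auto` once the command returns.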
