| 1 | +# AMD_SMI Component |
| 2 | + |
| 3 | +The **AMD_SMI** (AMD System Management Interface) component exposes hardware |
| 4 | +management counters (and selected controls) for AMD GPUs — e.g., power usage, |
| 5 | +temperatures, clocks, PCIe link metrics, VRAM information, and RAS/ECC status — |
| 6 | +by querying the AMD SMI library at runtime (ROCm ≥ 6.4.0). |
| 7 | + |
| 8 | +> **Configure note.** When both `amd_smi` and `rocm_smi` are requested, |
| 9 | +> PAPI’s configure script now inspects the ROCm stack and enables only the |
| 10 | +> appropriate SMI backend. We select `amd_smi` for ROCm 6.4.0 and newer, and |
| 11 | +> keep `rocm_smi` for older releases. This cutoff is based on internal testing |
| 12 | +> that showed AMD SMI becoming stable and feature-complete beginning with ROCm |
| 13 | +> 6.4.0. |
| 14 | + |
| 15 | +- [Environment Variables](#environment-variables) |
| 16 | +- [Enabling the AMD_SMI Component](#enabling-the-amd_smi-component) |
| 17 | + |
| 18 | +--- |
| 19 | + |
| 20 | +## Environment Variables |
| 21 | + |
| 22 | +For AMD_SMI, PAPI requires the environment variable `PAPI_AMDSMI_ROOT` to be set |
| 23 | +so that the AMD SMI shared library and headers can be found. This variable is |
| 24 | +required at both **compile** and **run** time. |
| 25 | + |
| 26 | +**Setting PAPI_AMDSMI_ROOT** |
| 27 | +Set `PAPI_AMDSMI_ROOT` to the top-level ROCm directory. For example: |
| 28 | + |
| 29 | + ```bash |
| 30 | + export PAPI_AMDSMI_ROOT=/opt/rocm-6.4.0 |
| 31 | + # or |
| 32 | + export PAPI_AMDSMI_ROOT=/opt/rocm |
| 33 | + ``` |
| 34 | + |
| 35 | +The directory specified by `PAPI_AMDSMI_ROOT` **must contain** the following |
| 36 | +subdirectories: |
| 37 | + |
| 38 | +- `${PAPI_AMDSMI_ROOT}/lib` (which should include the dynamic library `libamd_smi.so`) |
| 39 | +- `${PAPI_AMDSMI_ROOT}/include/amd_smi` (AMD SMI headers) |
| 40 | + |
| 41 | +If the library is not found or is not functional at runtime, the component will |
| 42 | +appear as "disabled" in `papi_component_avail`, with a message describing the |
| 43 | +problem (e.g., library not found). |
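
The layout requirements above can be checked before building or running PAPI. The following is a minimal sketch, not part of PAPI: `check_amdsmi_root` is a hypothetical helper name, and it only tests that the two paths named in this section exist.

```bash
# Hypothetical helper (not shipped with PAPI): verify that a ROCm prefix has
# the layout this README describes. Uses $1 if given, else PAPI_AMDSMI_ROOT.
check_amdsmi_root() {
    local root="${1:-${PAPI_AMDSMI_ROOT:-}}"
    if [ -z "$root" ]; then
        echo "PAPI_AMDSMI_ROOT is not set" >&2
        return 1
    fi
    local status=0
    # The shared library must exist and be readable.
    if [ ! -r "$root/lib/libamd_smi.so" ]; then
        echo "missing or unreadable: $root/lib/libamd_smi.so" >&2
        status=1
    fi
    # The AMD SMI headers are expected under include/amd_smi.
    if [ ! -d "$root/include/amd_smi" ]; then
        echo "missing directory: $root/include/amd_smi" >&2
        status=1
    fi
    return "$status"
}
```

For example, `check_amdsmi_root && echo "layout looks OK"` reports whether the current `PAPI_AMDSMI_ROOT` passes both checks. It does not confirm that the library is functional, only that the files are in place.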
| 44 | + |
| 45 | +### Library search order |
| 46 | + |
| 47 | +At initialization the component constructs the full path |
| 48 | +`${PAPI_AMDSMI_ROOT}/lib/libamd_smi.so` and hands it to `dlopen(3)`. If the file |
| 49 | +is missing or unreadable the component is disabled immediately. Any additional |
| 50 | +dependencies that `libamd_smi.so` brings in are resolved by the platform loader |
| 51 | +using the standard order: |
| 52 | + |
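| 53 | +- `DT_RPATH` entries of the loading object (used only when no `DT_RUNPATH` is present) |
| 54 | +- entries in `LD_LIBRARY_PATH`, followed by `DT_RUNPATH` entries |
| 55 | +- the `/etc/ld.so.cache` built from `/etc/ld.so.conf`, then default directories |
| 56 | + such as `/lib64`, `/usr/lib64`, `/lib`, and `/usr/lib` |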
| 57 | + |
| 58 | +Because the main shared object is loaded by absolute path, pointing |
| 59 | +`PAPI_AMDSMI_ROOT` at the directory tree that actually contains AMD SMI is the |
| 60 | +authoritative way to pick up non-standard installs. Symlinking |
| 61 | +`${PAPI_AMDSMI_ROOT}/lib/libamd_smi.so` to the desired copy also works. |
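
To see how the loader would resolve the dependencies of `libamd_smi.so` in the current environment, `ldd` can be run on the library. The sketch below assumes the usual `ldd` output, where an unresolved dependency is printed as `name => not found`. `unresolved_deps` is a hypothetical helper name, not a PAPI utility.

```bash
# Hypothetical helper: print the names of dependencies that ldd reports as
# unresolved. Reads ldd-style output on stdin so it fits into a pipeline.
unresolved_deps() {
    awk '/=> not found/ { print $1 }'
}
```

Typical use would be `ldd "${PAPI_AMDSMI_ROOT}/lib/libamd_smi.so" | unresolved_deps`. Empty output suggests that every dependency resolves with the current `LD_LIBRARY_PATH`. Because `ldd` runs outside the PAPI process, treat the result as an approximation of what `dlopen(3)` will see.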
| 62 | + |
| 63 | +### Handling non-standard installations |
| 64 | + |
| 65 | +- **Modules or package managers** – environment modules (`module load rocm`), |
| 66 | + Spack, or distro packages typically extend `PATH`, `LD_LIBRARY_PATH`, and |
| 67 | + other variables for you. Set `PAPI_AMDSMI_ROOT` to the corresponding ROCm |
| 68 | + prefix exported by the tool (check with `printenv` or `spack location`). |
| 69 | +- **Bare installs** – if AMD SMI lives elsewhere, export |
| 70 | + `PAPI_AMDSMI_ROOT=/custom/prefix` so that `${PAPI_AMDSMI_ROOT}/lib` and |
| 71 | + `${PAPI_AMDSMI_ROOT}/include/amd_smi` resolve correctly. |
| 72 | +- **Dependent libraries** – when a vendor build puts required runtime libraries |
| 73 | + (e.g., HIP, ROCm math libs) outside the ROCm tree, add those directories to |
| 74 | + `LD_LIBRARY_PATH`, for example: |
| 75 | + |
| 76 | + ```bash |
| 77 | + export LD_LIBRARY_PATH="/usr/lib64:/opt/vendor-extra/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" |
| 78 | + ``` |
| 78 | + ``` |
| 79 | + |
| 80 | + Always append/prepend to the existing variable to avoid clobbering entries |
| 81 | + added by other packages. |
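
Extending `LD_LIBRARY_PATH` by hand can leave duplicate entries, or an empty entry when the variable was previously unset (the loader treats an empty entry as the current directory). A small wrapper avoids both. This is a sketch under those assumptions, and `prepend_ld_path` is a hypothetical name rather than anything PAPI or ROCm provides.

```bash
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH unless it is
# already listed, without leaving an empty entry when the variable is unset.
prepend_ld_path() {
    local dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="${dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" ;;
    esac
    export LD_LIBRARY_PATH
}
```

Calling `prepend_ld_path /opt/vendor-extra/lib` twice leaves a single entry for that directory, and entries added earlier by modules or other packages are kept.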
| 82 | + |
| 83 | +--- |
| 84 | + |
| 85 | +## Enabling the AMD_SMI Component |
| 86 | + |
| 87 | +To enable reading (and where supported, writing) of AMD_SMI counters, build |
| 88 | +PAPI with this component enabled. For example: |
| 89 | + |
| 90 | +```bash |
| 91 | +./configure --with-components="amd_smi" |
| 92 | +make |
| 93 | +``` |
| 94 | + |
| 95 | +You can verify availability with the utilities in `papi/src/utils/`: |
| 96 | + |
| 97 | +```bash |
| 98 | +papi_component_avail # shows enabled/disabled components |
| 99 | +papi_native_avail -i amd_smi # lists native events for this component |
| 100 | +``` |
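
For scripted checks (for example in a job prologue), the component status can be derived from the `papi_component_avail` output. The sketch below makes two assumptions that should be confirmed against your PAPI version: the component's entry contains the string `amd_smi`, and a disabled component is marked with the word "disabled" on that line or the line after it. `amd_smi_enabled` is a hypothetical helper name.

```bash
# Hypothetical helper: read papi_component_avail output on stdin.
# Returns 0 if amd_smi is listed and not marked disabled, 1 if it is marked
# disabled (the matching lines go to stderr), 2 if it is not listed at all.
# Assumption: the disabled message sits on the entry line or the next line.
amd_smi_enabled() {
    local report
    report=$(grep -A1 'amd_smi') || return 2
    if printf '%s\n' "$report" | grep -qi 'disabled'; then
        printf '%s\n' "$report" >&2
        return 1
    fi
    return 0
}
```

A typical call would be `papi_component_avail | amd_smi_enabled || echo "amd_smi is not usable"`.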
| 101 | + |
| 102 | +After changing `PAPI_AMDSMI_ROOT` or related library paths, run `make clobber && ./configure --with-components="amd_smi"` before rebuilding so that configure picks up the new locations. |
| 103 | + |
| 104 | +--- |
| 105 | + |