Description
[Migrated from original issue] ROCm/rocm_smi_lib#220
Original issue author: @AngryLoki
Problem Description
Hi,
There is a small but annoying issue in rocm-smi when it attempts to output stats about an iGPU:
Exception caught: map::at: key not found
========================================== ROCm System Management Interface ==========================================
==================================================== Concise Info ====================================================
Device Node IDs Temp Power Partitions SCLK MCLK Fan Perf PwrCap VRAM% GPU%
(DID, GUID) (Edge) (Avg) (Mem, Compute, ID)
======================================================================================================================
0 1 0x744c, 57491 36.0°C 85.0W N/A, N/A, 0 422Mhz 96Mhz 0% auto 291.0W 0% 14%
1 2 0x164e, 16626 45.0°C 34.233W N/A, N/A, 0 N/A 2400Mhz 0% auto N/A 3% 0%
======================================================================================================================
================================================ End of ROCm SMI Log =================================================
I.e., with any options it prints "Exception caught: map::at: key not found". Here is why:
- setVoltSensorLabelMap is populated for exactly one sensor: vddgfx.
- It enumerates the in#_label files:
for f in /sys/class/drm/renderD*/device/hwmon/hwmon*/in*_label; do echo "$f:"; cat "$f"; done
/sys/class/drm/renderD128/device/hwmon/hwmon4/in0_label:
vddgfx
/sys/class/drm/renderD129/device/hwmon/hwmon5/in0_label:
vddgfx
/sys/class/drm/renderD129/device/hwmon/hwmon5/in1_label:
vddnb
where renderD128 is gfx1100 and renderD129 is gfx1036 (iGPU). Note: sensor 1 is the vddnb voltage for the north bridge.
- It tries to process get_supported_sensors 0 and 1 for the iGPU:
  https://github.com/ROCm/rocm_smi_lib/blob/ff7561607ef47829940f87b140421fdb4934a0a0/src/rocm_smi_monitor.cc#L633-L634
- getVoltSensorEnum throws a map::at: key not found exception, as there is no vddnb (1) voltage type in index_volt_type_map_.
A simple solution that fixes the issue is to register the vddnb voltage type. Please check the attached pull request.
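For anyone who wants to see the failure mode in isolation, here is a minimal C++ sketch of the chain described above. It is not the rocm_smi_lib code: VoltType, MakeLabelMap, ParseVoltSensorIndex, LookupStrict and TryLookup are hypothetical names invented for this illustration, and the real tables in the library may be shaped differently. The sketch only assumes what the analysis states: label files are named in#_label, only vddgfx is registered, and the lookup uses std::map::at, which throws std::out_of_range for a missing key.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the library's voltage types (illustrative names).
enum class VoltType { kGfx, kNb };

// Builds a label -> type table. With register_vddnb == false it mirrors the
// situation in the report: only "vddgfx" is known.
inline std::map<std::string, VoltType> MakeLabelMap(bool register_vddnb) {
  std::map<std::string, VoltType> labels;
  labels.emplace("vddgfx", VoltType::kGfx);
  if (register_vddnb) {
    labels.emplace("vddnb", VoltType::kNb);  // the proposed registration
  }
  return labels;
}

// Extracts N from a hwmon file name of the form "inN_label".
// Returns false for any other name.
inline bool ParseVoltSensorIndex(const std::string& name, int* index) {
  const std::string prefix = "in";
  const std::string suffix = "_label";
  if (name.size() <= prefix.size() + suffix.size()) return false;
  if (name.compare(0, prefix.size(), prefix) != 0) return false;
  if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0) {
    return false;
  }
  const std::string digits =
      name.substr(prefix.size(), name.size() - prefix.size() - suffix.size());
  int value = 0;
  for (char c : digits) {
    if (c < '0' || c > '9') return false;
    value = value * 10 + (c - '0');
  }
  *index = value;
  return true;
}

// The failing pattern: std::map::at throws std::out_of_range when the
// label was never registered.
inline VoltType LookupStrict(const std::map<std::string, VoltType>& labels,
                             const std::string& label) {
  return labels.at(label);
}

// A defensive alternative: report unknown labels to the caller instead of
// throwing, so an unrecognised sensor can simply be skipped.
inline bool TryLookup(const std::map<std::string, VoltType>& labels,
                      const std::string& label, VoltType* out) {
  const auto it = labels.find(label);
  if (it == labels.end()) return false;
  *out = it->second;
  return true;
}
```

With MakeLabelMap(false), LookupStrict on "vddnb" throws, which corresponds to the exception in the output above; with MakeLabelMap(true) the same call succeeds. TryLookup shows a second option that would also tolerate labels nobody has registered yet, though whether that fits the library's design is a question for the maintainers.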
Operating System
Gentoo
CPU
GPU
gfx1036 + gfx1100
ROCm Version
ROCm 6.4.1
ROCm Component
rocm_smi_lib
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response