@Andrew1326
Contributor

Summary

Fixes the assertion failure crash:

nvtop: ./src/extract_gpuinfo_amdgpu.c:964: parse_drm_fdinfo_amd:
Assertion `!cache_entry_check && "We should not be processing a client id twice per update"' failed.
Aborted (core dumped)

Root Cause

The assertion fails when a process has multiple file descriptors referencing the same DRM client (e.g., via dup(), fork(), or DRM master operations). The kcmp syscall check filters out file descriptors that share one open file description, but not distinct open file descriptions that report the same underlying DRM client_id.
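
To illustrate what kcmp can and cannot tell apart, here is a minimal Linux-only sketch. It is not code from nvtop: the helper name `compare_file_descriptions` is hypothetical, and it assumes the kernel exposes kcmp (the call returns -1 when it is unavailable or blocked by a sandbox).

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/kcmp.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nvtop): compare two descriptors of the
 * calling process with kcmp(KCMP_FILE).
 * Returns 0 when both refer to the same open file description,
 * a positive value (1, 2 or 3) when they do not, and -1 on error.
 */
static int compare_file_descriptions(int fd_a, int fd_b) {
  pid_t self = getpid();
  return (int)syscall(SYS_kcmp, self, self, KCMP_FILE,
                      (unsigned long)fd_a, (unsigned long)fd_b);
}
```

A descriptor obtained with dup() compares equal (0) to its original, so a kcmp-based filter drops it. Two separate open() calls on the same path yield distinct open file descriptions and compare unequal, so both pass the filter. The duplicate client_id has to be caught later, when the fdinfo content is parsed.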

Fix

Converts the debug assertion into a runtime check that:

  • Detects duplicate client_id entries in current_update_process_cache
  • Gracefully skips duplicates instead of crashing
  • Frees newly allocated cache entries to prevent memory leaks
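
The three points above can be sketched as follows. This is a simplified illustration under stated assumptions, not the merged patch: the types `client_entry` and `update_cache`, the helpers `cache_find`, `record_client` and `cache_clear`, and the linked-list lookup are all hypothetical stand-ins for nvtop's per-update process cache.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-update cache; not nvtop's real types. */
struct client_entry {
  unsigned client_id;
  unsigned long long busy_ns;
  struct client_entry *next;
};

struct update_cache {
  struct client_entry *head;
};

/* Linear lookup of a client id in the current update's cache. */
static struct client_entry *cache_find(const struct update_cache *cache,
                                       unsigned client_id) {
  for (struct client_entry *e = cache->head; e != NULL; e = e->next)
    if (e->client_id == client_id)
      return e;
  return NULL;
}

/*
 * Record one parsed fdinfo client. Returns true when the entry is stored and
 * false when it is skipped (duplicate client id or allocation failure).
 */
static bool record_client(struct update_cache *cache, unsigned client_id,
                          unsigned long long busy_ns) {
  struct client_entry *entry = calloc(1, sizeof(*entry));
  if (entry == NULL)
    return false;
  entry->client_id = client_id;
  entry->busy_ns = busy_ns;

  /* Runtime check in place of an assert: skip the duplicate, do not abort. */
  if (cache_find(cache, client_id) != NULL) {
    free(entry); /* release the new allocation so the skip does not leak */
    return false;
  }

  entry->next = cache->head;
  cache->head = entry;
  return true;
}

/* Release every entry at the end of an update. */
static void cache_clear(struct update_cache *cache) {
  while (cache->head != NULL) {
    struct client_entry *next = cache->head->next;
    free(cache->head);
    cache->head = next;
  }
}
```

Calling `record_client` twice with the same client_id keeps the first entry and returns false for the second, where an assert would have aborted the process. The entry is allocated before the duplicate check only to make the free-on-skip path visible.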

Files Changed

Applied the fix to all affected drivers:

  • extract_gpuinfo_amdgpu.c - AMD GPUs
  • extract_gpuinfo_intel_i915.c - Intel i915
  • extract_gpuinfo_intel_xe.c - Intel Xe
  • extract_gpuinfo_msm.c - Qualcomm MSM (also fixes an incorrect hash key: the code used &cid instead of &cache_entry->client_id)
  • extract_gpuinfo_mali_common.c - ARM Mali

Testing

Tested on the AMD GPU system where the crash occurred; nvtop now runs without crashing.

The assertion "We should not be processing a client id twice per update"
can fail when a process has multiple file descriptors referencing the
same DRM client (e.g., via dup(), fork(), or DRM master operations).

The kcmp syscall filters duplicate file descriptions but not distinct
file descriptions that report the same underlying DRM client_id.

This change converts the debug assertion into a runtime check that
gracefully skips duplicate entries and frees any newly allocated
cache entries to prevent memory leaks.

Fixes the crash:
  nvtop: ./src/extract_gpuinfo_amdgpu.c:964: parse_drm_fdinfo_amd:
  Assertion `!cache_entry_check && "We should not be processing a
  client id twice per update"' failed.

Applied to all affected drivers:
- AMDGPU
- Intel i915
- Intel Xe
- Qualcomm MSM (also fixed incorrect hash key usage)
- ARM Mali
@Andrew1326
Contributor Author

Fixes #435

@Syllo
Owner

Syllo commented Jan 16, 2026

Thanks a lot for looking into it

@Syllo merged commit c48d058 into Syllo:master on Jan 16, 2026
3 checks passed