Various Components: Avoid Mixed Memory Allocation #382

Merged
Treece-Burgess merged 1 commit into icl-utk-edu:master from Treece-Burgess:05-16-2025-cmp-memory-allocation
Jun 2, 2025

@Treece-Burgess Treece-Burgess commented May 20, 2025

Pull Request Description

This PR addresses components that mix C memory allocation (malloc, calloc, realloc) with PAPI memory allocation. Mixing the two becomes an issue when PAPI is configured with the flag --with-debug=memory, because:

  1. PAPI memory allocation creates extra metadata for each allocation
  2. Calling papi_free on a variable that was allocated with plain malloc, calloc, or realloc will result in a segmentation fault

The following components were changed:

| Component | Memory Allocation Used Before Changes | Memory Allocation Used After Changes | Tests Run | Hardware Tested On | Kernel / Toolkit Version |
|---|---|---|---|---|---|
| rocm_smi | Mixed | PAPI | papi_component_avail ✅; papi_native_avail ✅; papi_command_line ✅; Application Code with PAPI_shutdown ✅ | 2 * MI210s | ROCm 6.4.0 |
| rocm | Mixed | PAPI | papi_component_avail ✅; papi_native_avail ✅; papi_command_line ✅; Application Code with PAPI_shutdown ✅ | 2 * MI210s | ROCm 6.4.0 |
| coretemp | Mixed | PAPI | papi_component_avail ✅; papi_native_avail ✅; papi_command_line ✅; Application Code with PAPI_shutdown ✅ | AMD EPYC 7413 | 4.18.0-553.16.1.el8_10.x86_64 |
| cuda | Mixed | C | papi_component_avail ✅; papi_native_avail ✅; papi_command_line ✅; Application Code with PAPI_shutdown ✅ | 1 * H100 | Cuda Toolkit 12.6.3 |
| infiniband | Mixed | PAPI | papi_component_avail ✅; papi_native_avail ✅; papi_command_line ✅; Application Code with PAPI_shutdown ✅ | Intel Xeon Gold 6140 | 6.1.129-1.el9.elrepo.x86_64 |

Note that the cuda component was not switched to PAPI memory allocation and uses C memory allocation consistently instead. During testing with --with-debug=memory, the runtime with PAPI memory allocation was so long that the utilities appeared to hang.

Author Checklist

  • Description
    Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
  • Commits
    Commits are self contained and only do one thing
    Commits have a header of the form: module: short description
    Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
  • Tests
    The PR needs to pass all the tests

@tokey-tahmid

I am testing this PR.

@Treece-Burgess Treece-Burgess force-pushed the 05-16-2025-cmp-memory-allocation branch from 4851aa9 to 4538ef4 Compare May 29, 2025 14:38

@tokey-tahmid tokey-tahmid left a comment

Tested on Oregon's Gilgamesh (AMD EPYC 7413, 1 * H100, 2 * MI200) with CUDA Toolkit 12.8 and ROCm 6.4.0.

All PAPI utilities and tests perform as expected.

…ation to avoid possible segmentation faults.
@Treece-Burgess Treece-Burgess force-pushed the 05-16-2025-cmp-memory-allocation branch from 4538ef4 to 8837701 Compare June 2, 2025 21:47
@Treece-Burgess Treece-Burgess merged commit f2609ca into icl-utk-edu:master Jun 2, 2025
20 checks passed
Labels

status-ready-for-review: PR is ready to be reviewed
type-bug: Issues discussing bugs or PRs fixing bugs
