cuda perfworks api: Update internal string handling #513
I am reviewing this PR.
I am still reviewing this PR. I encounter the following error message when running on an NVIDIA GeForce RTX 5080 (Blackwell architecture), using CUDA 12.9.
I did two rounds of testing these changes on the NVIDIA Hopper architecture using CUDA 13.0.0. For the first round of testing, I enabled the debug messages. This confirmed that the changes in this PR indeed resolve the reported error messages. For the second round of testing, I disabled the debug messages to more easily compare (between this feature branch and master) the output of the PAPI utilities (
This makes sense because these events were excluded from the event table due to the string truncation errors that this PR addresses. Additionally, the component tests behave as expected.
dbarry9 left a comment
@Treece-Burgess Can you please rebase this PR so that we can merge it? I have attached to this message what I believe src/components/cuda/cupti_profiler.c should be to resolve the merge conflict.
cupti_profiler.c
6d72bc0 to c006d15
Pull Request Description
Issue:
Currently, when using the master branch with the following configure options
./configure --prefix=$PWD/test-install --with-components="cuda" --with-debug=yes
on a machine (Hopper1 at Oregon) with an NVIDIA GH200, the following is printed when running papi_native_avail:
This is due to a few of the CUPTI metric names having a total of 128 characters or more, while we internally copy only 128 characters (PAPI_MAX_STR_LEN). As a result, the last few characters are cut off internally and the CUPTI call fails.
This PR resolves this behavior.
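The truncation described above can be illustrated with a minimal, standalone C sketch. It is not the code from this PR: SKETCH_MAX_STR_LEN, name_fits, and copy_name_checked are hypothetical names made up for the illustration, and the 128-byte limit is assumed to match PAPI_MAX_STR_LEN as stated in the description. The idea is that a copy into a fixed-size buffer should report truncation to the caller rather than silently hand a shortened metric name to a later call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: mirrors the 128-character limit that the PR description
 * attributes to PAPI_MAX_STR_LEN. Hypothetical name, not a PAPI macro. */
#define SKETCH_MAX_STR_LEN 128

/* Hypothetical helper: returns 1 if src, including its terminating NUL,
 * fits into a buffer of dst_size bytes, and 0 otherwise. */
static int name_fits(const char *src, size_t dst_size)
{
    assert(src != NULL);
    return strlen(src) < dst_size;
}

/* Hypothetical helper: copy src into dst and report truncation.
 * Returns 0 on success and -1 if the name did not fit (or on an
 * encoding error). On truncation, dst still holds a NUL-terminated
 * prefix of src, which is the silently shortened name that would
 * later be rejected by a lookup expecting the full name. */
static int copy_name_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* snprintf returns the length the full string would have had,
     * so a result >= dst_size means characters were dropped. */
    int needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size) {
        return -1;
    }
    return 0;
}
```

With a made-up 140-character name, copy_name_checked returns -1 for a SKETCH_MAX_STR_LEN-sized buffer and leaves only a 127-character prefix in it, whereas a 256-byte buffer holds the full name and the call returns 0. Whether the actual fix enlarges the buffers, allocates them dynamically, or does something else is not stated in this description.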
Testing
Testing was done on Hopper1 at Oregon with the setup:
The PAPI utilities papi_component_avail, papi_native_avail, and papi_command_line all ran successfully. Along with the utilities, the Cuda component tests all passed.
Author Checklist
Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
Commits are self contained and only do one thing
Commits have a header of the form: module: short description
Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
The PR needs to pass all the tests