cuda perfworks api: Remove CTC metrics from native event pool#514

Merged
Treece-Burgess merged 1 commit into icl-utk-edu:master from Treece-Burgess:11-20-2025-cuda-remove-ctc-events
Jan 6, 2026

@Treece-Burgess Treece-Burgess commented Nov 21, 2025

Pull Request Description

Issue:
On NVIDIA GH200s, the cuda component lists CTC events:


cuda:::CTC.TriageCompute.ctc__rx_bytes
cuda:::CTC.TriageCompute.ctc__rx_bytes.peak_sustained
cuda:::CTC.TriageCompute.ctc__rx_bytes.peak_sustained_active

These events can be added successfully but cannot be profiled:

[tburgess@hopper1 bin]$ ./papi_command_line cuda:::CTC.TriageCompute.ctc__rx_bytes

This utility lets you add events from the command line interface to see if they work.

Successfully added: cuda:::CTC.TriageCompute.ctc__rx_bytes

Error! PAPI_start

Profiling CTC events requires the CUPTI Range Profiling API. The cuda component currently supports only the Legacy APIs and the Perfworks Metrics API.

Because of this, PR #458 added a note to the CTC metrics' descriptions, which reads:


NOTE: The NVIDIA Perfworks API that the cuda component utilizes does not support profiling CTC metrics.

This PR removes the CTC events from the available cuda native events on a GH200 for the following reasons:

  1. These events cannot be profiled with our current API support, and the note we left is fairly hidden.
  2. The cuda component test refactor selects the first available event on a device if the user does not provide one. On the GH200 this first event is a CTC event, which would cause the tests to fail.
  3. Support for the CUPTI Range Profiling API will be added in the future.
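
As a rough illustration of the intended effect (not the component's actual code, which is written in C and is not shown in this description), the filtering can be sketched as below. The helper names `is_ctc_metric` and `filter_native_events` are hypothetical, `cuda:::some_other_metric` is a placeholder event name, and matching on a case-insensitive `ctc` prefix after `cuda:::` is an assumption based on the event names listed above.

```python
PREFIX = "cuda:::"  # component prefix as it appears in the event names above


def is_ctc_metric(event_name):
    """Return True if the metric part of the name starts with 'ctc' (any case)."""
    if event_name.startswith(PREFIX):
        metric = event_name[len(PREFIX):]
    else:
        metric = event_name
    return metric.lower().startswith("ctc")


def filter_native_events(event_names):
    """Keep every event that is not a CTC metric (hypothetical helper)."""
    return [name for name in event_names if not is_ctc_metric(name)]


events = [
    "cuda:::CTC.TriageCompute.ctc__rx_bytes",
    "cuda:::CTC.TriageCompute.ctc__rx_bytes.peak_sustained",
    "cuda:::some_other_metric",  # placeholder, not a real event name
]
kept = filter_native_events(events)
print(kept)
```

With a filter of this kind in place, a test that picks the first available event on a GH200 would no longer land on a CTC event.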

Testing

Testing was done on Hopper1 at Oregon, which has the following setup:

  • CPU: ARM Neoverse V2

  • GPU: 1 * GH200

  • OS: RHEL 9.4

  • Cuda Toolkit 12.8.1

  • papi_component_avail: ✅

  • papi_native_avail: ✅ (does not output CTC events)

  • papi_command_line: ✅ (does not add CTC events)

  • cuda component tests: ✅

Author Checklist

  • Description
    Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
  • Commits
    Commits are self contained and only do one thing
    Commits have a header of the form: module: short description
    Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
  • Tests
    The PR needs to pass all the tests

@Treece-Burgess added the component-cuda, status-ready-for-review, and type-maintenance labels Nov 21, 2025

dbarry9 commented Dec 12, 2025

I am reviewing this PR.

dbarry9 commented Jan 5, 2026

I have tested these changes on the NVIDIA Hopper architecture using CUDA 13.0.0. When running the utilities (papi_avail, papi_native_avail, papi_component_avail, papi_hardware_avail, and papi_mem_info), the only difference between this feature branch and master is the number of available native events reported by papi_component_avail (64820 in master vs 64550 in this feature branch). This difference of 270 is exactly the number of events whose names start with "cuda:::[ctc|CTC]*" in the output of papi_native_avail from the master branch. These (and only these) events are absent from the output of papi_native_avail on this feature branch.
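
The comparison described above can be sketched as a small check. This is only an illustration: it assumes the event names from each branch's papi_native_avail output have already been collected into lists, the helper names are hypothetical, the sample names are placeholders, and the prefix test is a case-insensitive approximation of the "cuda:::[ctc|CTC]*" pattern.

```python
def starts_with_ctc(name):
    """Approximate the 'cuda:::[ctc|CTC]*' pattern with a case-insensitive prefix test."""
    return name.lower().startswith("cuda:::ctc")


def only_ctc_removed(master_names, branch_names):
    """True if the branch dropped exactly the CTC events from master and added nothing."""
    master_set, branch_set = set(master_names), set(branch_names)
    removed = master_set - branch_set
    added = branch_set - master_set
    expected = {name for name in master_set if starts_with_ctc(name)}
    return not added and removed == expected


# Placeholder event lists standing in for parsed papi_native_avail output.
master = [
    "cuda:::CTC.TriageCompute.ctc__rx_bytes",
    "cuda:::CTC.TriageCompute.ctc__rx_bytes.peak_sustained",
    "cuda:::some_other_metric",
]
branch = ["cuda:::some_other_metric"]

print(only_ctc_removed(master, branch))  # True
print(64820 - 64550)                     # 270, the event-count difference reported above
```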

The cuda component tests behave as expected.

Note that I did not test these changes on the NVIDIA Blackwell architecture because it does not contain CTC events.

@Treece-Burgess Can you please rebase this PR so that we can merge it?

@Treece-Burgess force-pushed the 11-20-2025-cuda-remove-ctc-events branch from 183cf61 to d7d7d02 on January 6, 2026 00:13
@Treece-Burgess Treece-Burgess merged commit 142d159 into icl-utk-edu:master Jan 6, 2026
4 checks passed