Contributor

@ZelboK ZelboK commented Dec 3, 2025

  1. Tests seem to pass locally when run with:
export ROCP_TOOL_LIBRARIES=$(python -c "import triton._C.libproton as p; print(p.__file__)")
export LD_PRELOAD=/opt/rocm/lib/librocprofiler-sdk-roctx.so # see the roctx note further down for why this is needed
pytest third_party/proton/test/test_profile.py

I am not familiar enough with the CI to know the best way to incorporate this setup for these tests.
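As one possible starting point for CI, the three shell commands above could be wrapped in a small Python launcher. This is only a sketch: `build_proton_env`, `find_proton_lib`, and `run_tests` are hypothetical helper names, not part of Triton or Proton, and the library path is simply the one quoted above.

```python
import os
import subprocess
import sys

# Path taken from the commands above; adjust for the local ROCm install.
ROCTX_LIB = "/opt/rocm/lib/librocprofiler-sdk-roctx.so"


def build_proton_env(proton_lib, roctx_lib=ROCTX_LIB, base_env=None):
    """Return a copy of the environment with both variables set.

    Hypothetical helper: it only mirrors the two `export` lines above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ROCP_TOOL_LIBRARIES"] = proton_lib
    # Keep anything that is already preloaded; LD_PRELOAD accepts a
    # colon-separated list of libraries.
    existing = env.get("LD_PRELOAD", "")
    env["LD_PRELOAD"] = f"{roctx_lib}:{existing}" if existing else roctx_lib
    return env


def find_proton_lib():
    """Same lookup as the `python -c` one-liner above."""
    import triton._C.libproton as p  # requires a Triton build with Proton

    return p.__file__


def run_tests(test_path="third_party/proton/test/test_profile.py"):
    """Run pytest in a child process so the variables are set before HIP loads."""
    env = build_proton_env(find_proton_lib())
    cmd = [sys.executable, "-m", "pytest", test_path]
    return subprocess.run(cmd, env=env).returncode
```

A CI step could then call `sys.exit(run_tests())`. Setting the variables on the child process, rather than in the already running interpreter, keeps them in place before any HIP initialization happens in the test process.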

  1. Right now lazy loading isn't supported by rocprofiler-sdk. Because the SDK has to be initialized before HIP in order to intercept kernels properly, and PyTorch will initialize HIP if we don't do it first, I am using ROCP_TOOL_LIBRARIES to get around that. As I understand it, rocprofiler-sdk looks for this environment variable and then calls rocprofiler_configure (note to self: audit later).

  2. Most importantly, rocprofiler-sdk supports attaching and detaching at any point, so you can minimize overhead by profiling only when needed.

  3. proton-cli is the way to use the new library, as a consequence of 2. Lazy loading may be added in the future.

  4. We use rocprofiler-register to make our lives easier. Each library (HIP, ROCr) calls into rocprofiler-register and provides its interception table. This is the widely accepted, de facto standard mechanism in the ROCm stack.

  5. PyTorch seems to be using /opt/rocm/lib/libroctx64.so, but rocprofiler-sdk intercepts /opt/rocm/lib/librocprofiler-sdk-roctx.so. A few of the nvtx tests were failing because of this, hence the LD_PRELOAD in the commands above.
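To confirm which roctx library a process actually ends up with, one option on Linux is to inspect `/proc/self/maps` after the framework has been imported. The sketch below assumes a Linux host; `roctx_libraries` and `roctx_libraries_in_this_process` are hypothetical helper names introduced for illustration.

```python
import os


def roctx_libraries(maps_text):
    """Return sorted paths of mapped files whose file name contains 'roctx'."""
    found = set()
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if "roctx" in os.path.basename(path):
                found.add(path)
    return sorted(found)


def roctx_libraries_in_this_process():
    """Read the memory map of the current process (Linux only)."""
    with open("/proc/self/maps") as f:
        return roctx_libraries(f.read())
```

Calling `roctx_libraries_in_this_process()` right after `import torch`, with and without the LD_PRELOAD setting, should show whether libroctx64.so, librocprofiler-sdk-roctx.so, or both are mapped.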

For this PR, I think the ergonomics of the libraries we're loading need to be reviewed to make sure they're ideal.

NOTE: Stochastic sampling will be added in a future PR. The work is not complete yet, but it seems to work.

danial javady added 7 commits November 24, 2025 19:18
- Removed RocprofPCSampling from MetricKind enum
- Removed RocprofPCSamplingMetric class from Metric.h
- Removed RocprofPCSampling handling from TreeData.cpp
- Deleted RocprofPCSampling.h and RocprofPCSampling.cpp
@ZelboK ZelboK changed the title from "rRocprofsdk proton" to "[AMD] refactor proton to use rocprofiler-sdk and deprecate roctracer" Dec 3, 2025