Merged
14 changes: 14 additions & 0 deletions thunder/dev_utils/nvtx_profile_transform.py
@@ -40,6 +40,20 @@ def nvtx_pop_impl():


class NvtxProfileTransform(thunder.core.transforms.Transform):
    """A trace transform that adds NVTX profiling markers around computation operations.

    This transform wraps each computation operation in the trace with NVTX range push/pop calls,
    enabling fine-grained profiling of individual operations in tools like NVIDIA Nsight Systems.

    Warning:
        When the model is complex and the trace has a lot of symbols (for example, when fusion
        executors are not being used), this transform might slow down the overall execution as
        host-side latency will be increased.

        This transform is intended for debug purposes; use it to debug execution and avoid
        enabling it for production or performance benchmarking runs.
    """

    def transform_trace_post_optimization(self, trace: Trace, **kwargs) -> Trace:
        with Timer() as timer:
            profile_trace = from_trace(trace)
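
The wrapping described in the docstring can be pictured with a small, library-free sketch. It is not the code from this PR: `range_push`, `range_pop`, `wrap_ops`, and `run` are hypothetical stand-ins that record events in a list instead of calling NVTX, and a "trace" is modeled as a plain list of named callables. It only illustrates why every operation gains two extra host-side calls.

```python
from typing import Callable, List, Tuple

Op = Tuple[str, Callable[[int], int]]

# Hypothetical stand-ins for NVTX range push/pop; they only record events.
events: List[str] = []


def range_push(name: str) -> None:
    events.append(f"push:{name}")


def range_pop() -> None:
    events.append("pop")


def wrap_ops(ops: List[Op]) -> List[Op]:
    """Return a new op list where each op is bracketed by push/pop markers."""
    wrapped: List[Op] = []
    for name, fn in ops:
        # Bind name and fn as defaults so each closure keeps its own op.
        def profiled(x: int, _name: str = name, _fn: Callable[[int], int] = fn) -> int:
            range_push(_name)
            try:
                return _fn(x)
            finally:
                range_pop()  # always close the range, even if the op raises

        wrapped.append((name, profiled))
    return wrapped


def run(ops: List[Op], x: int) -> int:
    """Apply the ops in order, feeding each result into the next op."""
    for _, fn in ops:
        x = fn(x)
    return x


ops: List[Op] = [("add_one", lambda x: x + 1), ("double", lambda x: x * 2)]
result = run(wrap_ops(ops), 3)
```

In this sketch two operations produce four marker events, so the marker count grows linearly with the number of symbols in the trace. That is consistent with the docstring's warning about unfused traces, where many small operations each pay the push/pop cost on the host.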