Commit 5b3a82c

prbasyal-amd and yhuiYH authored
[ROCm Compute Profiler] 710 update for Dynamic attach and Mono repo (… (#1635)
* [ROCm Compute Profiler] 710 update for Dynamic attach and Mono repo (#1631)

* [rocprof-compute] Documentation changes for move to super-repo for 7.1 (#1329)

- also remove json output mention in docs

* Dynamic process attachment update

* Index file updated

---------

Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>

* Changelog updated

---------

Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
1 parent e01eba9 commit 5b3a82c

4 files changed, +28 -20 lines changed

CHANGELOG.md

Lines changed: 13 additions & 13 deletions
@@ -6,33 +6,33 @@ Full documentation for ROCm Compute Profiler is available at [https://rocm.docs.

 ### Added

-* Improved standalone Roofline plots in profile mode (PDF output) and analyze mode (CLI and GUI visual plots):
-  * Fixed the peak MFMA/VALU lines being cut off.
-  * Cleaned up the overlapping roofline numeric values by moving them into the side legend.
-  * Added AI points chart with respective values, cache level, and compute/memory bound status.
-  * Added full kernel names to symbol chart.
-
-* Add support for multi-kernel applications' pc sampling.
-  * PC Sampling's outputs' instructions are displayed with the name of the kernel that individual instruction belongs to.
-  * Single kernel selection is supported so that the pc samples of selected kernel can be displayed.
-
+* Add support for PC sampling of multi-kernel applications.
+  * PC Sampling output instructions are displayed with the name of the kernel that individual instruction belongs to.
+  * Single kernel selection is supported so that the PC samples of selected kernel can be displayed.

 ### Changed

 * Roofline analysis now runs on GPU 0 by default instead of all GPUs.

 ### Optimized

-* Improved Roofline Benchmarking by updating the `flops_benchmark` calculation.
+* Improved roofline benchmarking by updating the `flops_benchmark` calculation.
+
+* Improved standalone roofline plots in profile mode (PDF output) and analyze mode (CLI and GUI visual plots):
+  * Fixed the peak MFMA/VALU lines being cut off.
+  * Cleaned up the overlapping roofline numeric values by moving them into the side legend.
+  * Added AI points chart with respective values, cache level, and compute/memory bound status.
+  * Added full kernel names to symbol chart.

 ### Resolved issues

-* Bugfixes for stability
+* Resolved existing issues to improve stability.

 ## ROCm Compute Profiler 3.3.0 for ROCm 7.1.0

 ### Added
-* Live attach/detach feature that allows coupling with a workload process, without controlling its start or end.
+
+* Dynamic process attachment feature that allows coupling with a workload process, without controlling its start or end.
   * Use '--attach-pid' to specify the target process ID.
   * Use '--attach-duration-msec' to specify time duration.

docs/how-to/live_attach_detach.rst

Lines changed: 9 additions & 7 deletions
@@ -1,12 +1,12 @@
 .. meta::
-   :description: ROCm Compute Profiler: using Live Attach Detach
-   :keywords: ROCm Compute Profiler, Attach Detach
+   :description: Dynamic process attachment in ROCm Compute Profiler
+   :keywords: ROCm Compute Profiler, Attach, Detach, Dynamic process attachment

 ***********************************************************
-Using Live Attach/Detach in ROCm Compute Profiler
+Dynamic process attachment in ROCm Compute Profiler
 ***********************************************************

-Live Attach/Detach is a new feature of ROCm Compute Profiler that allows coupling with a workload process, without controlling its start or end. The application can already be running before the profiler application is invoked. The profiler simply attaches to the process, collects the required counters, and then detaches—without altering the lifecycle of the workload.
+Dynamic process attachment is a new feature of ROCm Compute Profiler that allows coupling with a workload process, without controlling its start or end. The application can already be running before the profiler application is invoked. The profiler simply attaches to the process, collects the required counters, and then detaches—without altering the lifecycle of the workload.

 A specific attach is not repeatable, and it can only collect the set of counters that the hardware is capable of capturing in a single run. As such, in the current implementation, you must specify a subset of counter groups that can be collected within one run. This can be done either by using the ``--block`` option (for example, --block 3.1.1 4.1.1 5.1.1) or by providing a predefined set through the use of single pass counter collection ``--set``.

@@ -26,6 +26,7 @@ For using profiling options for PC sampling the configuration needed are:
 **Sample command:**

 .. code-block:: shell
+
    $ rocprof-compute profile -n try_live_attach_detach -b 3.1.1 4.1.1 5.1.1 --no-roof -VVV --attach-pid <process id of workload>

    $ rocprof-compute profile -n try_live_attach_detach --set launch_stats --no-roof -VVV --attach-pid <process id of workload>
@@ -37,10 +38,11 @@ For using profiling options for PC sampling the configuration needed are:
 -----------------------
 Analysis options
 -----------------------
-The analyze options for attach/detach are completely compatible with the non-attach/detach option.
+
+The analyze options for Dynamic process attachment are completely compatible with other non-Dynamic process attachment options.

 .. note::

-   * Live Attach Detach feature is currently in BETA version. To enable Live/Attach Detach, you need to have the correct supported proper version of ROCprofiler-SDK and rocprofiler-register.
-   * To make the Live Attach/Detach feature work, you must use "--block" or a single path to limit the number of counter input files to one. This limitation will be removed in a later version with implementations such as Iteration Multiplexing.
+   * Dynamic process attachment feature is currently in BETA version. To enable Dynamic process attachment, you need to have the correct supported version of ROCprofiler-SDK and rocprofiler-register.
+   * To make the Dynamic process attachment feature work, you must use "--block" or a single path to limit the number of counter input files to one. This limitation will be removed in a later version with implementations such as Iteration Multiplexing.
    * Due to the limitation of ROCprofiler-SDK, the attach can now only happen before Heterogeneous System Architecture (HSA) initialization. HSA initialization happens before the execution of the first HIP kernel call. It only happens once to save all the kernels' function signature, such as the function name and other launch parameters. Attaching after this stage misses all crucial information of the HIP kernel and makes it impossible to store the output. This limitation will be solved in later releases of ROCprofiler-SDK.
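
Reviewer sketch (not part of the commit): the attach workflow documented above could be wrapped in a small shell helper that assembles the command for a given process ID. The function name `build_attach_cmd` is hypothetical, and the helper only prints the command rather than running it. The flags are copied from the sample command and the changelog in this diff. Combining `--attach-duration-msec` with the other flags, and the 5000 ms value, are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of rocprof-compute): prints the attach command
# for a given PID instead of executing it, so it can be reviewed first.
build_attach_cmd() {
    attach_pid="$1"
    attach_duration_msec="${2:-}"   # optional; illustrative value only

    # Flags below are taken from the sample command in this commit's docs.
    attach_cmd="rocprof-compute profile -n try_live_attach_detach"
    attach_cmd="${attach_cmd} -b 3.1.1 4.1.1 5.1.1 --no-roof --attach-pid ${attach_pid}"

    # Assumption: --attach-duration-msec may be combined with the flags above.
    if [ -n "${attach_duration_msec}" ]; then
        attach_cmd="${attach_cmd} --attach-duration-msec ${attach_duration_msec}"
    fi

    printf '%s\n' "${attach_cmd}"
}

# Example: print the command for PID 1234 with a 5000 ms attach window.
build_attach_cmd 1234 5000
```

In practice the PID would typically come from `$!` right after starting the workload in the background. Per the note above, the attach has to happen before HSA initialization, so the workload would need to be attached to before its first HIP kernel call.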

docs/index.rst

Lines changed: 4 additions & 0 deletions
@@ -42,7 +42,11 @@ in practice.

 * :doc:`how-to/use`

+* :doc:`how-to/pc_sampling`
+
 * :doc:`how-to/profile/mode`
+
+* :doc:`how-to/live_attach_detach`

 * :doc:`how-to/analyze/mode`

docs/sphinx/_toc.yml.in

Lines changed: 2 additions & 0 deletions
@@ -20,6 +20,8 @@ subtrees:
 - file: how-to/use.rst
 - file: how-to/pc_sampling.rst
   title: Use PC sampling
+- file: how-to/live_attach_detach.rst
+  title: Use Dynamic process attachment
 - file: how-to/profile/mode.rst
 - file: how-to/analyze/mode.rst
   entries:
