Commit 24c2746

Minor editorial changes data type selection feature (#816)
1 parent 8099fd3 commit 24c2746

3 files changed: +11 -11 lines changed

CHANGELOG.md

Lines changed: 4 additions & 4 deletions
@@ -25,11 +25,11 @@ Full documentation for ROCm Compute Profiler is available at [https://rocm.docs.
 * -b option in profile mode also accept hardware IP block for filtering, however, this support will be deprecated soon
 * --list-metrics option added in profile mode to list possible metric id(s), similar to analyze mode

-* Datatype selection option for roofline profiling
-* --roofline-data-type / -R option added to specify which datatypes the user wants to capture in the roofline PDF plot outputs
+* Data type selection option for roofline profiling
+* --roofline-data-type / -R option added to specify which data types the user wants to capture in the roofline PDF plot outputs
 * Default is FP32, but user can specify as many types as desired to overlay on the same plot output

-* Additional datatypes for roofline profiling
+* Additional data types for roofline profiling
 * Now supports FP4, FP6, FP8, FP16, BF16, FP32, FP64, I8, I32, I64 (dependent on gpu architecture)

 * Support host-trap PC Sampling on CLI (beta version)
@@ -40,7 +40,7 @@ Full documentation for ROCm Compute Profiler is available at [https://rocm.docs.
 * Scheduler-Pipe Wave Utilization
 * Scheduler FIFO Full Rate
 * CPC ADC Utilization
-* F6F4 datatype metrics
+* F6F4 data type metrics
 * Update formula for total FLOPs while taking into account F6F4 ops
 * LDS STORE, LDS LOAD, LDS ATOMIC instruction count metrics
 * LDS STORE, LDS LOAD, LDS ATOMIC bandwidth metrics

docs/how-to/analyze/standalone-gui.rst

Lines changed: 3 additions & 3 deletions
@@ -76,9 +76,9 @@ application's profiling data:
 #. Memory Chart Analysis
 #. Empirical Roofline Analysis

-Use ``--roofline-data-type`` option to specify which datatype(s) you would like displayed on the roofline PDFs in the standalone analysis GUI.
-Datatypes can be stacked- for example, "--roofline-data-type FP32 FP64 I32" would display one PDF with FP32 and FP64 stacked, and one PDF with INT32.
-Default roofline datatype plotted is FP32.
+Use ``--roofline-data-type`` option to specify which data type(s) you would like displayed on the roofline PDFs in the standalone analysis GUI.
+Data types can be stacked- for example, "--roofline-data-type FP32 FP64 I32" would display one PDF with FP32 and FP64 stacked, and one PDF with INT32.
+Default roofline data type plotted is FP32.

 #. Top Stats (Top Kernel Statistics)
 #. System Info

docs/how-to/profile/mode.rst

Lines changed: 4 additions & 4 deletions
@@ -197,7 +197,7 @@ an Instinct MI210 vs an Instinct MI250.
 ``sysinfo.csv``, is created to reflect the target device settings. All
 profiling output is stored in ``log.txt``. Roofline-specific benchmark
 results are stored in ``roofline.csv`` and roofline plots are outputted into PDFs as
-``empirRoof_gpu-0_[datatype1]_..._[datatypeN].pdf`` where datatypes requested through
+``empirRoof_gpu-0_[datatype1]_..._[datatypeN].pdf`` where data types requested through
 ``--roofline-data-type`` option are listed in the file name.

 .. code-block:: shell-session
@@ -477,11 +477,11 @@ Roofline options
 running a roofline benchmark on your system.

 ``--roofline-data-type <datatype>``
-Allows you to specify datatypes that you want plotted in the roofline PDF output(s). Selecting more than one datatype will overlay the results onto the same plot. Default: FP32
+Allows you to specify data types that you want plotted in the roofline PDF output(s). Selecting more than one data type will overlay the results onto the same plot. Default: FP32

 .. note::

-For more information on datatypes supported based on the GPU architecture, see :doc:`../../conceptual/performance-model`
+For more information on data types supported based on the GPU architecture, see :doc:`../../conceptual/performance-model`

 To distinguish different kernels in your ``.pdf`` roofline plot use
 ``--kernel-names``. This will give each kernel a unique marker identifiable from
@@ -525,7 +525,7 @@ successfully.
525525
526526
.. note::
527527

528-
* ROCm Compute Profiler currently captures roofline profiling for all data types, but has the ability to reduce clutter in the PDF outputs by selecting datatype(s). Selecting multiple datatypes will overlay the results into the same PDF. If the user would like separate PDFs for each datatype off of the same workload run, the user can run the profiling command again with the single datatype as long as the roofline.csv still exists in the workload folder.
528+
* ROCm Compute Profiler currently captures roofline profiling for all data types, and you can reduce the clutter in the PDF outputs by filtering the data type(s). Selecting multiple data types will overlay the results into the same PDF. To generate results in separate PDFs for each data type from the same workload run, you can re-run the profiling command with each data type as long as the ``roofline.csv`` file still exists in the workload folder.
529529
* Roofline feature is currently not enabled on AMD Instinct MI350.
530530

531531
The following image is a sample ``empirRoof_gpu-0_FP32.pdf`` roofline
