Commit 11888f9

swdev-515336: removing deprecated features from documentation (#12)
[ROCm/rocprofiler commit: 1fcb951]
1 parent 43c0738 commit 11888f9

3 files changed

+3
-154
lines changed

projects/rocprofiler/README.md

Lines changed: 1 addition & 119 deletions
@@ -369,121 +369,6 @@ Usage:
369369
rocprofv2 --plugin json --hip-trace -d output_dir <app_relative_path>
370370
```
371371
372-
- ATT (Advanced thread tracer) plugin: advanced hardware traces data in binary format. Please refer ATT section.
373-
Tool used to collect fine-grained hardware metrics. Provides ISA-level instruction hotspot analysis via hardware tracing.
374-
375-
- Install plugin package. See Plugin Support section for installation
376-
- Run the following to view the trace. Att-specific options must come right after the assembly file.
377-
- On ROCm 6.0, ATT enables automatic capture of the ISA during kernel execution, and does not require recompiling. It is recommended to leave at "auto".
378-
379-
```bash
380-
rocprofv2 -i input.txt --plugin att auto --mode csv <app_relative_path>
381-
# Or using a user-supplied ISA:
382-
# rocprofv2 -i input.txt --plugin att <app_assembly_file> --mode csv <app_relative_path>
383-
```
384-
385-
- app_relative_path
386-
Path for the running application
387-
- ATT plugin optional parameters
388-
- --att_kernel "filename": Kernel filename(s) (glob) to use. A CSV file (or UI folder) will be generated for each kernel.txt file. Default: all in current folder.
389-
- --mode [csv, file, off (default)]
390-
- off
391-
Runs trace collection but not analysis, so it can be analyzed at a later time. Run rocprofv2 ATT with the same parameters (+ --mode csv), removing the application binary, to analyze previously generated traces.
392-
- csv
393-
Dumps the analyzed assembly into a CSV format, with the hitcount and total cycles cost. Recommended mode for most users.
394-
- file (deprecated)
395-
Dumps the analyzed json files to disk for viewing at a later time. Run python3 httpserver.py from within the generated name_ui/ folder to view the trace. The folder can be copied to another machine, and will run without rocm.
396-
- file,csv
397-
Both options can be used at the same time, generating a UI folder and a .csv.
398-
- network [removed]
399-
Network mode was removed, since it's functionality is included in file mode with the httpserver.py script generated inside the UI folder.
400-
- input.txt
401-
Required. Used to select specific compute units and other trace parameters.
402-
For first time users, using the following input file:
403-
404-
```bash
405-
# vectoradd
406-
att: TARGET_CU=1
407-
SE_MASK=0x1
408-
SIMD_SELECT=0x3
409-
```
410-
411-
```bash
412-
# histogram
413-
att: TARGET_CU=0
414-
SE_MASK=0xFF
415-
SIMD_SELECT=0xF // 0xF for GFX9, SIMD_SELECT=0 for Navi
416-
```
417-
418-
Possible contents:
419-
- att: TARGET_CU=1 // or some other CU [0,15] - WGP for Navi [0,8]
420-
- SE_MASK=0x1 // bitmask of shader engines. The fewer, the easier on the hardware. Default enables 1 out of 4 shader engines.
421-
- SIMD_SELECT=0xF // GFX9: bitmask of SIMDs. Navi: SIMD Index [0-3]. Recommended 0xF for GFX9 and 0x0 for Navi.
422-
- DISPATCH=ID // collect trace only for the given dispatch_ID. Multiple lines for can be added.
423-
- DISPATCH=ID,RN // collect trace only for the given dispatch_ID and MPI rank RN. Multiple lines with varying combinations of RN and ID can be added.
424-
- KERNEL=kernname // Profile only kernels containing the string kernname (c++ mangled name). Multiple lines can be added.
425-
- PERFCOUNTERS_CTRL=0x3 // Multiplier period for counter collection [0~31]. 0=fastest. GFX9 only.
426-
- PERFCOUNTER_MASK=0xFFF // Bitmask for perfcounter collection. GFX9 only.
427-
- PERFCOUNTER=counter_name // Add a SQ counter to be collected with ATT; period defined by PERFCOUNTERS_CTRL. GFX9 only.
428-
- BUFFER_SIZE=[size] // Sets size of the ATT buffer collection, per dispatch, in megabytes (shared among all shader engines).
429-
- ISA_CAPTURE_MODE=[0,1,2] // Set codeobj capture mode during kernel dispatch.
430-
- 0 = capture symbols only.
431-
- 1 = capture symbols for file:// and make a copy of memory://, dump captured copy as .out file.
432-
- 2 = Copy file:// and memory://, dump copied codeobj as .out files.
433-
- DISPATCH_RANGE=[begin],[end] // Continuously collect ATT data starting at "begin" and stop at "end". Alternative to DISPATCH= and KERNEL=.
434-
- By default, kernel names are truncated for ATT. To disable, please see the kernel name truncation section below.
435-
436-
- Example for vectoradd.
437-
438-
```bash
439-
# -g adds debugging symbols to the binary. Required only for tracking disassembly back to c++.
440-
hipcc -g vectoradd_hip.cpp -o vectoradd_hip.exe
441-
# "auto" means to use the automatically captured ISA, e.g. vectoradd_float_v0_isa.s dumped along with .att files.
442-
# "--mode csv" dumps the result to "att_output_vectoradd_float_v0.csv".
443-
rocprofv2 -i input.txt --plugin att auto --mode csv ./vectoradd_hip.exe
444-
```
445-
```bash
446-
# Alternatively, using --save-temps to generate the ISA
447-
hipcc -g --save-temps vectoradd_hip.cpp -o vectoradd_hip.exe
448-
# Replace "auto" with <generated_gpu_isa.s> for user-supplied ISA. Typically they match the wildcards *amdgcn-amd-amdhsa*.s.
449-
# Special attention to the correct architecture for the ISA, such as "gfx1100" (navi31).
450-
rocprofv2 -i input.txt --plugin att vectoradd_hip-hip-amdgcn-amd-amdhsa-gfx1100.s --mode csv ./vectoradd_hip.exe
451-
```
452-
453-
Instruction latencies will be in att_output_vectoradd_float_v0.csv
454-
455-
```bash
456-
# Use -d option to specify the generated data directory, and -o to specify dir and filename is the csv:
457-
rocprofv2 -d mydir -o test/mycsv -i input.txt --plugin att auto --mode csv ./vectoradd_hip.exe
458-
# Generates raw files inside mydir/ and the parsed data on test/mycsv_vectoradd_float_v0.csv
459-
```
460-
461-
***
462-
Note: For MPI or long running applications, we recommend to run collection, and later run the parser with already collected data:
463-
Run only collection: The assembly file is not used. Use mpirun [...] rocprofv2 [...] if needed.
464-
465-
```bash
466-
# Run only data collection, not the parser
467-
rocprofv2 -i input.txt --plugin att auto --mode off ./vectoradd_hip.exe
468-
```
469-
470-
Remove the binary/application from the command line.
471-
472-
```bash
473-
# Only runs the parser on previously collected data.
474-
rocprofv2 -i input.txt --plugin att auto --mode csv
475-
```
476-
477-
Note 2: By default, ATT only collects a SINGLE kernel dispatch for the whole application, which is the first dispatch matching the given filters (DISPATCH=<id> or KERNEL=<name>). To collect multiple dispatches in a single application run, use:
478-
479-
```bash
480-
export ROCPROFILER_MAX_ATT_PROFILES=<max_collections>
481-
```
482-
483-
Or, alternatively, use the continuous ATT mode (DISPATCH_RANGE parameter).
484-
485-
***
486-
487372
### Flush Interval
488373
489374
Flush interval can be used to control the interval time in milliseconds between the buffers flush for the tool. However, if the buffers are full the flush will be called on its own. This can be used as in the next example:
@@ -656,9 +541,8 @@ samples can be run as independent executables once installed
656541
- plugin
657542
- file: File Plugin
658543
- perfetto: Perfetto Plugin
659-
- att: Advanced thread tracer Plugin
660544
- ctf: CTF Plugin
661-
- samples: Samples of how to use the API, and also input.txt input file samples for counter collection and ATT.
545+
- samples: Samples of how to use the API, and also input.txt input file samples for counter collection.
662546
- script: Scripts needed for tracing
663547
- src: Source files of the project
664548
- api: API implementation for rocprofv2
@@ -673,8 +557,6 @@ samples can be run as independent executables once installed
673557
- filter: Type of profiling or tracing and its properties
674558
- tracer: Tracing support of the session
675559
- profiler: Profiling support of the session
676-
- spm: SPM support of the session
677-
- att: ATT support of the session
678560
- tools: Tools needed to run profiling and tracing
679561
- rocsys: Controlling Session from another CLI
680562
- utils: Utilities needed by the project

projects/rocprofiler/bin/rocprofv2

Lines changed: 1 addition & 3 deletions
@@ -55,10 +55,8 @@ usage() {
5555
echo -e "${GREEN}--kernel-trace ${RESET} For Collecting Kernel dispatch Traces"
5656
echo -e "${GREEN}--sys-trace ${RESET} For Collecting HIP and HSA APIs and their Activities Traces along ROCTX\n"
5757
echo -e "\t#${GREY}usage e.g: rocprofv2 --[hip-trace|hsa-trace|roctx-trace|kernel-trace|sys-trace] <executable>\n"${RESET}
58-
echo -e "${GREEN}--plugin ${RESET} PLUGIN_NAME For enabling a plugin (cli/file/perfetto/att/ctf/speedscope)"
58+
echo -e "${GREEN}--plugin ${RESET} PLUGIN_NAME For enabling a plugin (cli/file/perfetto/ctf/speedscope)"
5959
echo -e "\t#${GREY} usage(file/perfetto/ctf) e.g: rocprofv2 -i pmc.txt --plugin [file/perfetto/ctf/json] -d out_dir <executable>"
60-
echo -e "\t# usage(att): rocprofv2 <rocprofv2_params> --plugin att <ISA_file> <att_parameters> <executable>"
61-
echo -e "\t# use \"rocprofv2 --plugin att --help\" for ATT-specific parameters help.${RESET}\n"
6260
echo -e "\t# use \"rocprofv2 --plugin json --disable-json-data-flows ...\" for SpeedScope support as speedscope doesn't support data flows.${RESET}\n"
6361
echo -e "${GREEN}--plugin-version ${RESET} <1|2> For selecting the version for the plugin (1/2)"
6462
echo -e "\t#${GREY} 1 - Legacy output format, 2 - New output format (default)${RESET}\n"

projects/rocprofiler/doc/rocprofv2_tool.md

Lines changed: 1 addition & 32 deletions
@@ -145,39 +145,10 @@ The user has two options for building:
145145
rocprofv2 --help
146146
```
147147

148-
- (ATT) Advanced Thread Trace: It can collect kernel running time, granular hardware metrics per kernel dispatch and provide hotspot analysis at source code level via hardware tracing.
149-
Usage:
150-
```bash
151-
# ATT(Advanced Thread Trace) needs few preconditions before running.
152-
# 1. Make sure to generate the assembly file for application by executing the following before compiling your HIP Application
153-
export HIPCC_COMPILE_FLAGS_APPEND="--save-temps -g"
154-
155-
# 2. Install plugin package
156-
see Plugin Support section for installation
157-
158-
# 3. Run the following to view the trace
159-
rocprofv2 --plugin att <app_relative_path_assembly_file> --mode <network, file, off> -i input.txt <app_relative_path>
160-
161-
# app_assembly_file_relative_path is the assembly file with .s extension generated in 1st step
162-
# app_relative_path is the path for the application binary
163-
# Mode:
164-
# - Network: opens the server with the browser UI.
165-
# att needs 2 ports opened (8000, 18000), In case the browser is running on a different machine.
166-
# - File: dumps the json files to disk, it can be used to quickly verify if there is anything wrong with the data.
167-
# - Off runs collection but not analysis/parsing. So it can be later used on another system to be viewed.
168-
# input.txt gives flexibility to to target the compute unit and provide filters.
169-
# input.txt contents:
170-
# TARGET_CU=1 // or some other CU [0,15]
171-
# SE_MASK=0x1 // bitmask of shader engines. The fewer, the easier on the hardware. Default enables all 24 because SE_MASK code is recent.
172-
# SIMD_MASK=0xF // bitmask of SIMDs, there are four in GFX9.
173-
# samples/att.txt is having an example on how to right input file for ATT
174-
```
175-
176148
- Plugin Support: We have a template for adding new plugins. New plugins can be written on top of rocprofv2 to support the desired output format using include/rocprofiler/v2/rocprofiler_plugins.h header file. These plugins are modular in nature and can easily be decoupled from the code based on need. E.g.
177149
- file plugin: outputs the data in txt files.
178150
- Perfetto plugin: outputs the data in protobuf format.
179151
- Protobuf files can be viewed using ui.perfetto.dev or using trace_processor
180-
- ATT (Advanced thread tracer) plugin: advanced hardware traces data in binary format. Please refer ATT section.
181152
- CTF plugin: Outputs the data in ctf format(a binary trace format)
182153
- CTF binary output can be viewed using TraceCompass or babeltrace.
183154

@@ -267,9 +238,8 @@ samples can be run as independent executables once installed
267238
- plugin
268239
- file: File Plugin
269240
- perfetto: Perfetto Plugin
270-
- att: Advanced thread tracer Plugin
271241
- ctf: CTF Plugin
272-
- samples: Samples of how to use the API, and also input.txt input file samples for counter collection and ATT.
242+
- samples: Samples of how to use the API, and also input.txt input file samples for counter collection.
273243
- script: Scripts needed for tracing
274244
- src: Source files of the project
275245
- api: API implementation for rocprofv2
@@ -285,7 +255,6 @@ samples can be run as independent executables once installed
285255
- tracer: Tracing support of the session
286256
- profiler: Profiling support of the session
287257
- spm: SPM support of the session
288-
- att: ATT support of the session
289258
- tools: Tools needed to run profiling and tracing
290259
- rocsys: Controlling Session from another CLI
291260
- utils: Utilities needed by the project
