Commit 3e4982b

Update gpuDebugging.md
1 parent c7c04bd commit 3e4982b

1 file changed

+17
-16
lines changed

docs/documentation/gpuDebugging.md

Lines changed: 17 additions & 16 deletions
@@ -6,19 +6,19 @@
66
```bash
77
OMP_DISPLAY_ENV=true | false | verbose
88
```
9-
- Prints out the internal control values and environment variables at beginning of program if `true` or `verbose`
9+
- Prints out the internal control values and environment variables at the beginning of the program if `true` or `verbose`
1010
- `verbose` will also print out vendor-specific internal control values and environment variables
1111

1212
```bash
1313
OMP_TARGET_OFFLOAD = MANDATORY | DISABLED | DEFAULT
1414
```
15-
- Quick way to turn off off-load (DISABLED) or make it abort if a GPU isn't found (MANDATORY)
16-
- great first test: does the problem disappear when you drop back to the CPU?
15+
- Quick way to turn off off-load (`DISABLED`) or make it abort if a GPU isn't found (`MANDATORY`)
16+
- Great first test: does the problem disappear when you drop back to the CPU?
1717

1818
```bash
1919
OMP_THREAD_LIMIT=<positive_integer>
2020
```
21-
- Sets maximum number of OpenMP threads to use in a contention group
21+
- Sets the maximum number of OpenMP threads to use in a contention group
2222
- Might be useful in checking for issues with contention or race conditions
2323

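The "drop back to the CPU" test above can be scripted. The following is a minimal sketch, not part of the commit: `compare_offload` is a hypothetical helper name, and the command you pass to it (for example your own `./my_app`) is an assumption. It runs the same command once with `OMP_TARGET_OFFLOAD=DISABLED` and once with `MANDATORY`, then prints both exit statuses.

```bash
# Sketch: run one command with offload disabled, then with offload mandatory.
# Usage: compare_offload <command> [args...]   (helper name is hypothetical)
compare_offload() {
    cpu_status=0
    env OMP_TARGET_OFFLOAD=DISABLED "$@" || cpu_status=$?

    gpu_status=0
    env OMP_TARGET_OFFLOAD=MANDATORY "$@" || gpu_status=$?

    # Differing statuses suggest the problem is tied to the offload path.
    echo "cpu=$cpu_status gpu=$gpu_status"
}
```

If the CPU run succeeds and the GPU run fails, the offloaded code path (or device setup) is the more likely culprit. Comparing numerical output, not only exit codes, is usually needed as well.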
2424
```bash
@@ -33,7 +33,7 @@ OMP_DISPLAY_AFFINITY=TRUE
3333
```bash
3434
CRAY_ACC_DEBUG: 0 (off), 1, 2, 3 (very noisy)
3535
```
36-
- Dumps a time-stamped log line ("ACC: ) for every allocation, data transfer, kernel launch, wait, etc. Great first stop when "nothing seems to run on the GPU.
36+
- Dumps a time-stamped log line (`"ACC: ..."`) for every allocation, data transfer, kernel launch, wait, etc. Great first stop when "nothing seems to run on the GPU."
3737

3838
- Outputs on STDERR by default. Can be changed by setting `CRAY_ACC_DEBUG_FILE`.
3939
- Recognizes `stderr`, `stdout`, and `process`.
@@ -45,16 +45,16 @@ CRAY_ACC_DEBUG: 0 (off), 1, 2, 3 (very noisy)
4545
CRAY_ACC_FORCE_EARLY_INIT=1
4646
```
4747
- Force full GPU initialization at program start so you can see start-up hangs immediately
48-
- Default behavior without environment variable is to defer initialization on first use
49-
- Device initialization includes initializing the GPU vendor’s low-level device runtime library (e.g., libcuda for NVIDIA GPUs) and establishing all necessary software contexts for interacting with the device
48+
- Default behavior without this environment variable is to defer initialization until first use
49+
- Device initialization includes initializing the GPU vendor’s low-level device runtime library (e.g., libcuda for NVIDIA GPUs) and establishing all necessary software contexts for interacting with the device
5050

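As a rough illustration of combining the two Cray variables above, the sketch below captures the debug log from STDERR and counts the lines containing `ACC:`. The helper name `acc_debug_run`, the debug level `2`, and the log file path are assumptions chosen for the example. It also assumes `CRAY_ACC_DEBUG_FILE` is unset, so the log goes to STDERR.

```bash
# Sketch: capture CRAY_ACC_DEBUG output and count the "ACC:" log lines.
# Usage: acc_debug_run <logfile> <command> [args...]   (helper name is hypothetical)
acc_debug_run() {
    log_file=$1
    shift

    status=0
    env CRAY_ACC_DEBUG=2 CRAY_ACC_FORCE_EARLY_INIT=1 "$@" 2>"$log_file" || status=$?

    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    acc_lines=$(grep -c 'ACC:' "$log_file" || true)
    echo "exit=$status acc_lines=$acc_lines"
}
```

A count of zero for a program that should offload is a hint that no GPU activity was logged at all.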
5151
### Cray OpenACC Options
5252

5353
```bash
5454
CRAY_ACC_PRESENT_DUMP_SAVE_NAMES=1
5555
```
56-
- Will cause acc_present_dump() to output variable names and file locations in addition to variable mappings
57-
- Add acc_present_dump() around hotspots to help find problems with data movements
56+
- Will cause `acc_present_dump()` to output variable names and file locations in addition to variable mappings
57+
- Add `acc_present_dump()` around hotspots to help find problems with data movements
5858
- Most helpful when combined with the `CRAY_ACC_DEBUG` environment variable
5959

6060
## NVHPC Compiler Options
@@ -64,7 +64,7 @@ CRAY_ACC_PRESENT_DUMP_SAVE_NAMES=1
6464
```bash
6565
STATIC_RANDOM_SEED=1
6666
```
67-
- Forces the seed returned by RANDOM_SEED to be constant, so generates same sequence of random numbers
67+
- Forces the seed returned by `RANDOM_SEED` to be constant, so it generates the same sequence of random numbers
6868
- Useful for testing issues with randomized data
6969

7070
```bash
@@ -107,7 +107,7 @@ NVCOMPILER_ACC_DEBUG=1
107107
LIBOMPTARGET_PROFILE=run.json
108108
```
109109
- Emits a Chrome-trace (JSON) timeline you can open in chrome://tracing or Speedscope
110-
- great lightweight profiler when Nsight is over-kill.
110+
- Great lightweight profiler when Nsight is overkill.
111111
- Granularity in µs via `LIBOMPTARGET_PROFILE_GRANULARITY` (default 500).
112112

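A small wrapper can make the profiling step repeatable. This is a sketch under stated assumptions: `profile_run` is a hypothetical helper, and the granularity is set explicitly to the default of 500 mentioned above. It runs a command with `LIBOMPTARGET_PROFILE` pointing at a trace file and reports whether a non-empty file was produced.

```bash
# Sketch: run a command with libomptarget profiling enabled and report
# whether a non-empty trace file appeared.
# Usage: profile_run <trace.json> <command> [args...]   (helper name is hypothetical)
profile_run() {
    trace_file=$1
    shift
    rm -f "$trace_file"    # avoid mistaking a stale trace for a new one

    status=0
    env LIBOMPTARGET_PROFILE="$trace_file" \
        LIBOMPTARGET_PROFILE_GRANULARITY=500 "$@" || status=$?

    if [ -s "$trace_file" ]; then
        echo "trace written: $trace_file"
    else
        echo "no trace produced"
    fi
    return "$status"
}
```

If no trace appears, one possible cause is that the runtime in use was built without profiling support.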
113113
```bash
@@ -128,28 +128,29 @@ LIBOMPTARGET_INFO=<bitmask>
128128
LIBOMPTARGET_DEBUG=1
129129
```
130130
- Developer-level trace (host-side)
131-
- Much noisier than INFO
132-
- only works if the runtime was built with -DOMPTARGET_DEBUG.
131+
- Much noisier than `INFO`
132+
- Only works if the runtime was built with `-DOMPTARGET_DEBUG`.
133133

134134
```bash
135135
LIBOMPTARGET_JIT_OPT_LEVEL=-O{0,1,2,3}
136136
```
137137
- This environment variable can be used to change the optimization pipeline used to optimize the embedded device code as part of the device JIT.
138-
- The value is corresponds to the -O{0,1,2,3} command line argument passed to clang.
138+
- The value corresponds to the `-O{0,1,2,3}` command line argument passed to clang.
139139

140140
```bash
141141
LIBOMPTARGET_JIT_SKIP_OPT=1
142142
```
143143
- This environment variable can be used to skip the optimization pipeline during JIT compilation.
144144
- If set, the image will only be passed through the backend.
145-
- The backend is invoked with the LIBOMPTARGET_JIT_OPT_LEVEL flag.
145+
- The backend is invoked with the `LIBOMPTARGET_JIT_OPT_LEVEL` flag.
146146

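One way to use the JIT variables for debugging is to sweep the optimization level and see where behavior changes. The sketch below is illustrative only: `sweep_jit_opt` is a hypothetical helper, and the sweep only has an effect when the runtime actually JIT-compiles the device image.

```bash
# Sketch: run one command at each JIT optimization level and print its exit status.
# Usage: sweep_jit_opt <command> [args...]   (helper name is hypothetical)
sweep_jit_opt() {
    for level in -O0 -O1 -O2 -O3; do
        status=0
        env LIBOMPTARGET_JIT_OPT_LEVEL="$level" "$@" || status=$?
        printf '%s exit=%s\n' "$level" "$status"
    done
}
```

A failure that appears only at higher levels points toward an optimization-sensitive bug, such as a data race or undefined behavior in the device code.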
147147
## Compiler Documentation
148+
148149
- [Cray & OpenMP Docs](https://cpe.ext.hpe.com/docs/24.11/cce/man7/intro_openmp.7.html#environment-variables)
149150
- [Cray & OpenACC Docs](https://cpe.ext.hpe.com/docs/24.11/cce/man7/intro_openacc.7.html#environment-variables)
150151
- [NVHPC & OpenACC Docs](https://docs.nvidia.com/hpc-sdk/compilers/hpc-compilers-user-guide/index.html?highlight=NVCOMPILER_#environment-variables)
151152
- [NVHPC & OpenMP Docs](https://docs.nvidia.com/hpc-sdk/compilers/hpc-compilers-user-guide/index.html?highlight=NVCOMPILER_#id2)
152153
- [LLVM & OpenMP Docs](https://openmp.llvm.org/design/Runtimes.html)
153154
- NVHPC is built on top of LLVM
154155
- [OpenMP Docs](https://www.openmp.org/spec-html/5.1/openmp.html)
155-
- [OpenACC Docs](https://www.openacc.org/sites/default/files/inline-files/OpenACC.2.7.pdf)
156+
- [OpenACC Docs](https://www.openacc.org/sites/default/files/inline-files/OpenACC.2.7.pdf)
