Commit 63c2ae4

Add docs for Nsight
Update running.md

Co-Authored-By: Henry Le Berre <[email protected]>
1 parent 970a821 commit 63c2ae4

3 files changed: +14 -8 lines changed

docs/documentation/running.md

Lines changed: 6 additions & 0 deletions
@@ -89,6 +89,12 @@ modified by users.
 **Disclaimer**: IBM's JSRUN on LSF-managed computers does not use the traditional node-based approach to
 allocate resources. Therefore, the MFC constructs equivalent resource-sets in task and GPU count.

+### Profiling with NVIDIA Nsight
+
+MFC provides two arguments to facilitate profiling with NVIDIA Nsight. **Place the chosen argument last on the command line so that any flags following it are passed to the profiler.**
+- Nsight Systems (Nsys): `./mfc.sh run ... --nsys [nsys flags]` visualizes MFC's system-wide performance with [NVIDIA Nsight Systems](https://developer.nvidia.com/nsight-systems). Nsys is best for getting a general understanding of the order and execution times of major subroutines (WENO, Riemann, etc.) in MFC. When used, `--nsys` runs the simulation and generates `.nsys-rep` files in the case directory for all targets. These files can then be imported into the Nsight Systems GUI, which can be downloaded [here](https://developer.nvidia.com/nsight-systems/get-started#latest-Platforms). It is best to run case files with only a few timesteps so that the report files remain small. Learn more about NVIDIA Nsight Systems [here](https://docs.nvidia.com/nsight-systems/UserGuide/index.html).
+- Nsight Compute (NCU): `./mfc.sh run ... --ncu [ncu flags]` conducts kernel-level profiling with [NVIDIA Nsight Compute](https://developer.nvidia.com/nsight-compute). NCU provides profiling information for every subroutine called and is more detailed than Nsys. When used, `--ncu` outputs profiling information for all subroutines, including elapsed clock cycles, memory used, and more, after the simulation has run. Note that this argument significantly slows down the simulation, so it should only be used on case files with a few timesteps. Learn more about NVIDIA Nsight Compute [here](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html).
+
 ### Restarting Cases

 When running a simulation, MFC generates a `./restart_data` folder in the case directory that contains `lustre_*.dat` files that can be used to restart a simulation from saved timesteps. This allows a user to run a simulation to some timestep $X$, then later continue it to run to another timestep $Y$, where $Y > X$. The user can also choose to add new patches at the intermediate timestep.

toolchain/mfc/args.py

Lines changed: 4 additions & 4 deletions
@@ -69,7 +69,7 @@ def add_common_arguments(p, mask = None):
 test.add_argument("-o", "--only", nargs="+", type=str, default=[], metavar="L", help="Only run tests with UUIDs or hashes L.")
 test.add_argument("-b", "--binary", choices=binaries, type=str, default=None, help="(Serial) Override MPI execution binary")
 test.add_argument("-r", "--relentless", action="store_true", default=False, help="Run all tests, even if multiple fail.")
-test.add_argument("-a", "--test-all", action="store_true", default=False, help="Run the Post Process Tests too.")
+test.add_argument("-a", "--test-all", action="store_true", default=False, help="Run the Post Process Tests too.")
 test.add_argument("--case-optimization", action="store_true", default=False, help="(GPU Optimization) Compile MFC targets with some case parameters hard-coded.")

 # === RUN ===
@@ -86,11 +86,11 @@ def add_common_arguments(p, mask = None):
 run.add_argument("-a", "--account", metavar="ACCOUNT", type=str, default="", help="(Batch) Account to charge.")
 run.add_argument("-@", "--email", metavar="EMAIL", type=str, default="", help="(Batch) Email for job notification.")
 run.add_argument("-#", "--name", metavar="NAME", type=str, default="MFC", help="(Batch) Job name.")
-run.add_argument("-f", "--flags", metavar="FLAGS", nargs="+", type=str, default=[], help="(Batch) Additional batch options.")
+run.add_argument("-f", "--flags", metavar="FLAGS", nargs='+', type=str, default=[], help="(Batch) Additional batch options.")
 run.add_argument("-b", "--binary", choices=binaries, type=str, default=None, help="(Interactive) Override MPI execution binary")
 run.add_argument("-s", "--scratch", action="store_true", default=False, help="Build from scratch.")
-run.add_argument("--ncu", action="store_true", default=False, help="Profile with NVIDIA Nsight Compute.")
-run.add_argument("--nsys", action="store_true", default=False, help="Profile with NVIDIA Nsight Systems.")
+run.add_argument("--ncu", nargs=argparse.REMAINDER, type=str, help="Profile with NVIDIA Nsight Compute.")
+run.add_argument("--nsys", nargs=argparse.REMAINDER, type=str, help="Profile with NVIDIA Nsight Systems.")
 run.add_argument( "--dry-run", action="store_true", default=False, help="(Batch) Run without submitting batch file.")
 run.add_argument("--case-optimization", action="store_true", default=False, help="(GPU Optimization) Compile MFC targets with some case parameters hard-coded.")
 run.add_argument( "--no-build", action="store_true", default=False, help="(Testing) Do not rebuild MFC.")
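The switch from `action="store_true"` to `nargs=argparse.REMAINDER` is what makes the position of `--ncu` / `--nsys` on the command line matter. The following is a minimal, standalone sketch of that behavior, not MFC's actual parser: the `run-sketch` program name, the `input` positional, and the `-n/--tasks` option are assumptions made for illustration, and `--force-overwrite=true` is used only as a sample string to forward.

```python
import argparse

# Hypothetical stand-in for the `run` sub-parser. Only the `--nsys`
# definition mirrors the diff; `input` and `-n/--tasks` are illustrative.
parser = argparse.ArgumentParser(prog="run-sketch")
parser.add_argument("input", type=str)
parser.add_argument("-n", "--tasks", type=int, default=1)
parser.add_argument("--nsys", nargs=argparse.REMAINDER, type=str,
                    help="Profile with NVIDIA Nsight Systems.")

# 1) Flag absent: the attribute keeps its default, None.
absent = parser.parse_args(["case.py", "-n", "2"])

# 2) Flag given last with nothing after it: an empty list, not None.
bare = parser.parse_args(["case.py", "-n", "2", "--nsys"])

# 3) Flag given early: everything after it is swallowed, including "-n 2",
#    so --tasks silently stays at its default.
greedy = parser.parse_args(
    ["case.py", "--nsys", "--force-overwrite=true", "-n", "2"]
)

print(absent.nsys)                # None
print(bare.nsys)                  # []
print(greedy.nsys, greedy.tasks)  # ['--force-overwrite=true', '-n', '2'] 1
```

The third case is presumably why the documentation asks for the profiling argument to be placed at the end: any option that follows it is forwarded to the profiler instead of being parsed.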

toolchain/mfc/run/engines.py

Lines changed: 4 additions & 4 deletions
@@ -8,18 +8,18 @@


 def profiler_prepend():
-    if ARG("ncu"):
+    if ARG("ncu") is not None:
         if not common.does_command_exist("ncu"):
             raise common.MFCException("Failed to locate [bold green]NVIDIA Nsight Compute[/bold green] (ncu).")

         return ["ncu", "--nvtx", "--mode=launch-and-attach",
-                "--cache-control=none", "--clock-control=none"]
+                "--cache-control=none", "--clock-control=none"] + ARG("ncu")

-    if ARG("nsys"):
+    if ARG("nsys") is not None:
         if not common.does_command_exist("nsys"):
             raise common.MFCException("Failed to locate [bold green]NVIDIA Nsight Systems[/bold green] (nsys).")

-        return ["nsys", "profile", "--stats=true", "--trace=mpi,nvtx,openacc"]
+        return ["nsys", "profile", "--stats=true", "--trace=mpi,nvtx,openacc"] + ARG("nsys")

     return []

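The change from a truthiness test to `is not None` appears to follow from the new argument type: with `nargs=argparse.REMAINDER`, passing `--nsys` with no extra flags yields an empty list, which is falsy. The sketch below illustrates the difference under that assumption. `prefix_truthy` and `prefix_not_none` are hypothetical helpers written for this example, not functions from MFC, and they omit the `does_command_exist` check.

```python
from typing import List, Optional

# Base command copied from the diff; the helpers below are illustrative only.
NSYS_BASE = ["nsys", "profile", "--stats=true", "--trace=mpi,nvtx,openacc"]


def prefix_truthy(nsys_flags: Optional[List[str]]) -> List[str]:
    """Old-style check: fine for a boolean flag, wrong for an empty list."""
    if nsys_flags:
        return NSYS_BASE + nsys_flags
    return []


def prefix_not_none(nsys_flags: Optional[List[str]]) -> List[str]:
    """New-style check: None means 'flag absent', [] means 'no extra flags'."""
    if nsys_flags is not None:
        return NSYS_BASE + nsys_flags
    return []


# `--nsys` given with no extra flags (empty list from REMAINDER):
print(prefix_truthy([]))     # [] -> profiling would be skipped
print(prefix_not_none([]))   # the base nsys command

# `--nsys` not given at all:
print(prefix_not_none(None))  # []
```

With the truthiness check, a bare `--nsys` would be indistinguishable from omitting the flag, so the profiler would never be prepended.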
