Commit cf11592

Merge pull request #141 from anshgupta1234/nsight_docs
2 parents d49f636 + 63c2ae4 commit cf11592

3 files changed, 14 insertions(+), 8 deletions(-)

docs/documentation/running.md

Lines changed: 6 additions & 0 deletions
@@ -89,6 +89,12 @@ modified by users.
 **Disclaimer**: IBM's JSRUN on LSF-managed computers does not use the traditional node-based approach to
 allocate resources. Therefore, the MFC constructs equivalent resource-sets in task and GPU count.

+### Profiling with NVIDIA Nsight
+
+MFC provides two arguments to facilitate profiling with NVIDIA Nsight. **Please place the chosen argument at the end of the command so that its profiler flags can be appended.**
+- Nsight Systems (Nsys): `./mfc.sh run ... --nsys [nsys flags]` allows one to visualize MFC's system-wide performance with [NVIDIA Nsight Systems](https://developer.nvidia.com/nsight-systems). Nsys is best for getting a general understanding of the order and execution times of major subroutines (WENO, Riemann, etc.) in MFC. When used, `--nsys` will run the simulation and generate `.nsys-rep` files in the case directory for all targets. These files can then be imported into the Nsight Systems GUI, which can be downloaded [here](https://developer.nvidia.com/nsight-systems/get-started#latest-Platforms). It is best to run case files with a few timesteps so that the report files remain small. Learn more about NVIDIA Nsight Systems [here](https://docs.nvidia.com/nsight-systems/UserGuide/index.html).
+- Nsight Compute (NCU): `./mfc.sh run ... --ncu [ncu flags]` allows one to conduct kernel-level profiling with [NVIDIA Nsight Compute](https://developer.nvidia.com/nsight-compute). NCU provides profiling information for every subroutine called and is more detailed than Nsys. After the simulation is run, `--ncu` will output profiling information for all subroutines, including elapsed clock cycles, memory used, and more. Please note that this argument significantly slows down the simulation and should only be used on case files with a few timesteps. Learn more about NVIDIA Nsight Compute [here](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html).
+
 ### Restarting Cases

 When running a simulation, MFC generates a `./restart_data` folder in the case directory that contains `lustre_*.dat` files that can be used to restart a simulation from saved timesteps. This allows a user to run a simulation to some timestep $X$, then later continue it to run to another timestep $Y$, where $Y > X$. The user can also choose to add new patches at the intermediate timestep.

toolchain/mfc/args.py

Lines changed: 4 additions & 4 deletions
@@ -70,7 +70,7 @@ def add_common_arguments(p, mask = None):
     test.add_argument("-o", "--only", nargs="+", type=str, default=[], metavar="L", help="Only run tests with UUIDs or hashes L.")
     test.add_argument("-b", "--binary", choices=binaries, type=str, default=None, help="(Serial) Override MPI execution binary")
     test.add_argument("-r", "--relentless", action="store_true", default=False, help="Run all tests, even if multiple fail.")
-    test.add_argument("-a", "--test-all", action="store_true", default=False, help="Run the Post Process Tests too.")
+    test.add_argument("-a", "--test-all", action="store_true", default=False, help="Run the Post Process Tests too.")
     test.add_argument("--case-optimization", action="store_true", default=False, help="(GPU Optimization) Compile MFC targets with some case parameters hard-coded.")

     # === RUN ===
@@ -87,11 +87,11 @@ def add_common_arguments(p, mask = None):
     run.add_argument("-a", "--account", metavar="ACCOUNT", type=str, default="", help="(Batch) Account to charge.")
     run.add_argument("-@", "--email", metavar="EMAIL", type=str, default="", help="(Batch) Email for job notification.")
     run.add_argument("-#", "--name", metavar="NAME", type=str, default="MFC", help="(Batch) Job name.")
-    run.add_argument("-f", "--flags", metavar="FLAGS", nargs="+", type=str, default=[], help="(Batch) Additional batch options.")
+    run.add_argument("-f", "--flags", metavar="FLAGS", nargs='+', type=str, default=[], help="(Batch) Additional batch options.")
     run.add_argument("-b", "--binary", choices=binaries, type=str, default=None, help="(Interactive) Override MPI execution binary")
     run.add_argument("-s", "--scratch", action="store_true", default=False, help="Build from scratch.")
-    run.add_argument("--ncu", action="store_true", default=False, help="Profile with NVIDIA Nsight Compute.")
-    run.add_argument("--nsys", action="store_true", default=False, help="Profile with NVIDIA Nsight Systems.")
+    run.add_argument("--ncu", nargs=argparse.REMAINDER, type=str, help="Profile with NVIDIA Nsight Compute.")
+    run.add_argument("--nsys", nargs=argparse.REMAINDER, type=str, help="Profile with NVIDIA Nsight Systems.")
     run.add_argument( "--dry-run", action="store_true", default=False, help="(Batch) Run without submitting batch file.")
     run.add_argument("--case-optimization", action="store_true", default=False, help="(GPU Optimization) Compile MFC targets with some case parameters hard-coded.")
     run.add_argument( "--no-build", action="store_true", default=False, help="(Testing) Do not rebuild MFC.")
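Reviewer note: the switch from `action="store_true"` to `nargs=argparse.REMAINDER` changes what `--nsys` and `--ncu` evaluate to, which may explain why the docs ask for the argument to be placed last. The following is a minimal standalone sketch, not MFC's actual parser: the program name, the `input` positional, and the `-n/--tasks` option are hypothetical stand-ins, and `--force-overwrite=true` is only an illustrative profiler flag.

```python
import argparse

# Hypothetical stand-in for the `run` sub-parser; only --nsys mirrors the diff.
parser = argparse.ArgumentParser(prog="mfc-run-sketch")
parser.add_argument("input", type=str)
parser.add_argument("-n", "--tasks", type=int, default=1)
parser.add_argument("--nsys", nargs=argparse.REMAINDER, type=str)

# 1) Flag absent: the attribute keeps its default of None.
absent = parser.parse_args(["case.py", "-n", "2"])

# 2) Flag present with nothing after it: an empty list, which is falsy.
bare = parser.parse_args(["case.py", "-n", "2", "--nsys"])

# 3) Everything after --nsys is swallowed, including "-n 2",
#    so --tasks silently stays at its default.
greedy = parser.parse_args(["case.py", "--nsys", "--force-overwrite=true", "-n", "2"])

print(absent.nsys)   # None
print(bare.nsys)     # []
print(greedy.nsys)   # ['--force-overwrite=true', '-n', '2']
print(greedy.tasks)  # 1
```

Under these assumptions, any MFC option written after `--nsys` or `--ncu` would be forwarded to the profiler instead of being parsed by MFC.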

toolchain/mfc/run/engines.py

Lines changed: 4 additions & 4 deletions
@@ -8,18 +8,18 @@


 def profiler_prepend():
-    if ARG("ncu"):
+    if ARG("ncu") is not None:
         if not common.does_command_exist("ncu"):
             raise common.MFCException("Failed to locate [bold green]NVIDIA Nsight Compute[/bold green] (ncu).")

         return ["ncu", "--nvtx", "--mode=launch-and-attach",
-                "--cache-control=none", "--clock-control=none"]
+                "--cache-control=none", "--clock-control=none"] + ARG("ncu")

-    if ARG("nsys"):
+    if ARG("nsys") is not None:
         if not common.does_command_exist("nsys"):
             raise common.MFCException("Failed to locate [bold green]NVIDIA Nsight Systems[/bold green] (nsys).")

-        return ["nsys", "profile", "--stats=true", "--trace=mpi,nvtx,openacc"]
+        return ["nsys", "profile", "--stats=true", "--trace=mpi,nvtx,openacc"] + ARG("nsys")

     return []

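Reviewer note: the move from a truthiness check to `is not None` appears to matter once the argument is a list rather than a boolean. The sketch below isolates that logic under stated assumptions: `nsys_prefix` and `nsys_prefix_truthy` are hypothetical helpers that take the parsed value directly instead of calling `ARG`, and the `does_command_exist` check is omitted.

```python
from typing import List, Optional

# Base command copied from the diff above.
BASE_NSYS = ["nsys", "profile", "--stats=true", "--trace=mpi,nvtx,openacc"]


def nsys_prefix(nsys_arg: Optional[List[str]]) -> List[str]:
    """Mirror of the new check: None means the flag was not passed."""
    if nsys_arg is not None:
        return BASE_NSYS + nsys_arg
    return []


def nsys_prefix_truthy(nsys_arg: Optional[List[str]]) -> List[str]:
    """Mirror of the old check, kept only to show the difference."""
    if nsys_arg:
        return BASE_NSYS + nsys_arg
    return []


# A bare `--nsys` parses to an empty list, which is falsy.
print(nsys_prefix([]))         # base command, profiling enabled
print(nsys_prefix_truthy([]))  # [] , profiling would be skipped
print(nsys_prefix(None))       # [] , flag not passed
print(nsys_prefix(["--force-overwrite=true"]))
```

With the old truthiness test, `--nsys` given without extra flags would have produced no profiler prefix at all, so the `is not None` comparison looks like the right companion to the `argparse.REMAINDER` change.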
