docs/documentation/running.md: 6 additions & 0 deletions
@@ -89,6 +89,12 @@ modified by users.
 **Disclaimer**: IBM's JSRUN on LSF-managed computers does not use the traditional node-based approach to
 allocate resources. Therefore, the MFC constructs equivalent resource-sets in task and GPU count.

+### Profiling with NVIDIA Nsight
+
+MFC provides two arguments to facilitate profiling with NVIDIA Nsight. **Please ensure that the chosen argument is placed at the end of the command so that its flags can be appended.**
+- Nsight Systems (Nsys): `./mfc.sh run ... --nsys [nsys flags]` allows one to visualize MFC's system-wide performance with [NVIDIA Nsight Systems](https://developer.nvidia.com/nsight-systems). Nsys is best for getting a general understanding of the order and execution times of major subroutines (WENO, Riemann, etc.) in MFC. When used, `--nsys` will run the simulation and generate `.nsys-rep` files in the case directory for all targets. These files can then be imported into the Nsight Systems GUI, which can be downloaded [here](https://developer.nvidia.com/nsight-systems/get-started#latest-Platforms). It is best to run case files with a few timesteps so that the report files remain small. Learn more about NVIDIA Nsight Systems [here](https://docs.nvidia.com/nsight-systems/UserGuide/index.html).
+- Nsight Compute (NCU): `./mfc.sh run ... --ncu [ncu flags]` allows one to conduct kernel-level profiling with [NVIDIA Nsight Compute](https://developer.nvidia.com/nsight-compute). NCU provides profiling information for every subroutine called and is more detailed than Nsys. When used, `--ncu` will output profiling information for all subroutines, including elapsed clock cycles, memory used, and more, after the simulation is run. Please note that adding this argument significantly slows down the simulation, so it should only be used on case files with a few timesteps. Learn more about NVIDIA Nsight Compute [here](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html).
+
 ### Restarting Cases

 When running a simulation, MFC generates a `./restart_data` folder in the case directory that contains `lustre_*.dat` files that can be used to restart a simulation from saved timesteps. This allows a user to run a simulation to some timestep $X$, then later continue it to run to another timestep $Y$, where $Y > X$. The user can also choose to add new patches at the intermediate timestep.
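The profiling text added in this hunk asks for `--nsys` or `--ncu` to be the last argument so that profiler flags can follow it. A minimal sketch of that ordering rule is given below. `build_run_command` is a hypothetical helper that is not part of MFC, the `...` stands for the run arguments the documentation elides, and `--trace=cuda` is only a stand-in for whatever Nsight Systems flags one wishes to pass.

```python
from typing import Optional, Sequence


def build_run_command(run_args: Sequence[str],
                      profiler: Optional[str] = None,
                      profiler_flags: Sequence[str] = ()) -> list:
    """Assemble an `./mfc.sh run` command with the profiler argument last.

    Hypothetical helper for illustration only; it is not part of MFC.
    """
    if profiler not in (None, "nsys", "ncu"):
        raise ValueError(f"unknown profiler: {profiler!r}")

    cmd = ["./mfc.sh", "run", *run_args]
    if profiler is not None:
        # Everything after --nsys/--ncu is treated as a profiler flag, so the
        # profiler argument and its flags must close the command line.
        cmd += [f"--{profiler}", *profiler_flags]
    return cmd


# "..." mirrors the elided run arguments in the documentation above.
print(" ".join(build_run_command(["..."], "nsys", ["--trace=cuda"])))
```

Keeping the profiler argument in a separate parameter, rather than mixing it into `run_args`, makes it hard to place another option after it by accident.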
 test.add_argument("-r", "--relentless", action="store_true", default=False, help="Run all tests, even if multiple fail.")
-test.add_argument("-a", "--test-all", action="store_true", default=False, help="Run the Post Process Tests too.")
+test.add_argument("-a", "--test-all", action="store_true", default=False, help="Run the Post Process Tests too.")
 test.add_argument("--case-optimization", action="store_true", default=False, help="(GPU Optimization) Compile MFC targets with some case parameters hard-coded.")
 run.add_argument("-s", "--scratch", action="store_true", default=False, help="Build from scratch.")
-run.add_argument("--ncu", action="store_true", default=False,help="Profile with NVIDIA Nsight Compute.")
-run.add_argument("--nsys", action="store_true", default=False,help="Profile with NVIDIA Nsight Systems.")
+run.add_argument("--ncu", nargs=argparse.REMAINDER, type=str,help="Profile with NVIDIA Nsight Compute.")
+run.add_argument("--nsys", nargs=argparse.REMAINDER, type=str,help="Profile with NVIDIA Nsight Systems.")
 run.add_argument( "--dry-run", action="store_true", default=False, help="(Batch) Run without submitting batch file.")
 run.add_argument("--case-optimization", action="store_true", default=False, help="(GPU Optimization) Compile MFC targets with some case parameters hard-coded.")
 run.add_argument( "--no-build", action="store_true", default=False, help="(Testing) Do not rebuild MFC.")
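The switch from `action="store_true"` to `nargs=argparse.REMAINDER` is what makes argument position matter. The sketch below uses a stand-in parser that mirrors only three of the options in this hunk; the parser name and the `--trace=cuda` flag are illustrative assumptions, not MFC code. It shows the three cases that are probably worth keeping in mind: the profiler argument last, the profiler argument first, and the profiler argument with no flags.

```python
import argparse

# Stand-in parser mirroring only the options relevant here (not MFC's parser).
parser = argparse.ArgumentParser(prog="run-sketch")
parser.add_argument("--dry-run", action="store_true", default=False)
parser.add_argument("--ncu", nargs=argparse.REMAINDER, type=str)
parser.add_argument("--nsys", nargs=argparse.REMAINDER, type=str)

# Profiler argument last: its flags are collected and --dry-run still parses.
last = parser.parse_args(["--dry-run", "--nsys", "--trace=cuda"])
print(last.dry_run, last.nsys, last.ncu)

# Profiler argument first: --dry-run is swallowed into the profiler flags.
first = parser.parse_args(["--nsys", "--trace=cuda", "--dry-run"])
print(first.dry_run, first.nsys)

# Bare --nsys yields an empty list, while an absent option yields None.
bare = parser.parse_args(["--nsys"])
print(bare.nsys, bare.ncu)
```

Because a bare `--nsys` produces an empty (falsy) list, code that consumes these values would presumably need to test `args.nsys is not None` rather than rely on truthiness.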