
Commit 278ad76

MFC Action committed
Docs @ 8bddd33
1 parent dc00631 commit 278ad76

22 files changed, with 84 additions and 81 deletions

documentation/doxygen_crawl.html

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -112,17 +112,17 @@
112112
<a href="md_running.html#autotoc_md96"/>
113113
<a href="md_running.html#autotoc_md97"/>
114114
<a href="md_running.html#autotoc_md98"/>
115-
<a href="md_running.html#restarting_cases"/>
115+
<a href="md_running.html#autotoc_md99"/>
116116
<a href="md_testing.html"/>
117-
<a href="md_testing.html#autotoc_md100"/>
118117
<a href="md_testing.html#autotoc_md101"/>
118+
<a href="md_testing.html#autotoc_md102"/>
119119
<a href="md_visualization.html"/>
120-
<a href="md_visualization.html#autotoc_md103"/>
121120
<a href="md_visualization.html#autotoc_md104"/>
122121
<a href="md_visualization.html#autotoc_md105"/>
123122
<a href="md_visualization.html#autotoc_md106"/>
124123
<a href="md_visualization.html#autotoc_md107"/>
125124
<a href="md_visualization.html#autotoc_md108"/>
125+
<a href="md_visualization.html#autotoc_md109"/>
126126
<a href="pages.html"/>
127127
</body>
128128
</html>

documentation/md_case.html

Lines changed: 2 additions & 2 deletions
@@ -537,7 +537,7 @@ <h3><a class="anchor" id="autotoc_md15"></a>
537537
<li><code>dt</code> specifies the constant time step size that is used in simulation. The value of <code>dt</code> needs to be sufficiently small such that the Courant-Friedrichs-Lewy (CFL) condition is satisfied.</li>
538538
<li><code>t_step_start</code> and <code>t_step_end</code> define the time steps at which simulation starts and ends, respectively.</li>
539539
</ul>
540-
<p><code>t_step_save</code> is the time step interval for data output during simulation. To newly start the simulation, set <code>t_step_start = 0</code>. To restart simulation from $k$-th time step, set <code>t_step_start = k</code>, see <a href="md_running.html#restarting_cases">Restarting Cases</a>.</p>
540+
<p><code>t_step_save</code> is the time step interval for data output during simulation. To newly start the simulation, set <code>t_step_start = 0</code>. To restart simulation from $k$-th time step, set <code>t_step_start = k</code>, see <a href="running.md#restarting-cases">Restarting Cases</a>.</p>
541541
<h4><a class="anchor" id="autotoc_md16"></a>
542542
Adaptive Time-Stepping</h4>
543543
<ul>
@@ -548,7 +548,7 @@ <h4><a class="anchor" id="autotoc_md16"></a>
548548
<li><code>t_save</code> specifies the time interval between data output during simulation</li>
549549
<li><code>t_stop</code> specifies at what time the simulation should stop</li>
550550
</ul>
551-
<p>To newly start the simulation, set <code>n_start = 0</code>. To restart simulation from $k$-th time step, see <a href="md_running.html#restarting_cases">Restarting Cases</a>.).</p>
551+
<p>To newly start the simulation, set <code>n_start = 0</code>. To restart simulation from $k$-th time step, see <a href="running.md#restarting-cases">Restarting Cases</a>.</p>
552552
<h2><a class="anchor" id="autotoc_md17"></a>
553553
7. Formatted Output</h2>
554554
<table class="markdownTable">

documentation/md_expectedPerformance.html

Lines changed: 15 additions & 13 deletions
@@ -212,32 +212,34 @@ <h1><a class="anchor" id="autotoc_md71"></a>
212212
<tr class="markdownTableRowEven">
213213
<td class="markdownTableBodyRight">AMD EPYC 7513 </td><td class="markdownTableBodyRight">Milan </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">32 cores </td><td class="markdownTableBodyRight">7.4 </td><td class="markdownTableBodyLeft">GNU 12.3.0 </td><td class="markdownTableBodyLeft">GT ICE </td></tr>
214214
<tr class="markdownTableRowOdd">
215-
<td class="markdownTableBodyRight">AMD EPYC 7452 </td><td class="markdownTableBodyRight">Rome </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">32 cores </td><td class="markdownTableBodyRight">8.4 </td><td class="markdownTableBodyLeft">GNU 12.3.0 </td><td class="markdownTableBodyLeft">GT ICE </td></tr>
215+
<td class="markdownTableBodyRight">Intel Xeon 8268 </td><td class="markdownTableBodyRight">Cascade Lake </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">24 cores </td><td class="markdownTableBodyRight">7.5 </td><td class="markdownTableBodyLeft">Intel 2024.2 </td><td class="markdownTableBodyLeft">TAMU ACES </td></tr>
216216
<tr class="markdownTableRowEven">
217-
<td class="markdownTableBodyRight">NVIDIA T4 </td><td class="markdownTableBodyRight">FP32-only GPU </td><td class="markdownTableBodyRight">GPU </td><td class="markdownTableBodyRight">1 GPU </td><td class="markdownTableBodyRight">8.8 </td><td class="markdownTableBodyLeft">NVHPC 24.1 </td><td class="markdownTableBodyLeft">TAMU Faster </td></tr>
217+
<td class="markdownTableBodyRight">AMD EPYC 7452 </td><td class="markdownTableBodyRight">Rome </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">32 cores </td><td class="markdownTableBodyRight">8.4 </td><td class="markdownTableBodyLeft">GNU 12.3.0 </td><td class="markdownTableBodyLeft">GT ICE </td></tr>
218218
<tr class="markdownTableRowOdd">
219-
<td class="markdownTableBodyRight">Intel Xeon 8160 </td><td class="markdownTableBodyRight">Skylake </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">24 cores </td><td class="markdownTableBodyRight">8.9 </td><td class="markdownTableBodyLeft">Intel 2024.0 </td><td class="markdownTableBodyLeft">TACC Stampede3 </td></tr>
219+
<td class="markdownTableBodyRight">NVIDIA T4 </td><td class="markdownTableBodyRight">FP32-only GPU </td><td class="markdownTableBodyRight">GPU </td><td class="markdownTableBodyRight">1 GPU </td><td class="markdownTableBodyRight">8.8 </td><td class="markdownTableBodyLeft">NVHPC 24.1 </td><td class="markdownTableBodyLeft">TAMU Faster </td></tr>
220220
<tr class="markdownTableRowEven">
221-
<td class="markdownTableBodyRight">IBM Power10 </td><td class="markdownTableBodyRight"></td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">24 cores </td><td class="markdownTableBodyRight">10 </td><td class="markdownTableBodyLeft">GNU 13.3.1 </td><td class="markdownTableBodyLeft">GT Rogues Gallery </td></tr>
221+
<td class="markdownTableBodyRight">Intel Xeon 8160 </td><td class="markdownTableBodyRight">Skylake </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">24 cores </td><td class="markdownTableBodyRight">8.9 </td><td class="markdownTableBodyLeft">Intel 2024.0 </td><td class="markdownTableBodyLeft">TACC Stampede3 </td></tr>
222222
<tr class="markdownTableRowOdd">
223-
<td class="markdownTableBodyRight">AMD EPYC 7401 </td><td class="markdownTableBodyRight">Naples </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">24 cores </td><td class="markdownTableBodyRight">10 </td><td class="markdownTableBodyLeft">GNU 10.3.1 </td><td class="markdownTableBodyLeft">LLNL Corona </td></tr>
223+
<td class="markdownTableBodyRight">IBM Power10 </td><td class="markdownTableBodyRight"></td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">24 cores </td><td class="markdownTableBodyRight">10 </td><td class="markdownTableBodyLeft">GNU 13.3.1 </td><td class="markdownTableBodyLeft">GT Rogues Gallery </td></tr>
224224
<tr class="markdownTableRowEven">
225-
<td class="markdownTableBodyRight">Intel Xeon 6226 </td><td class="markdownTableBodyRight">Cascade Lake </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">12 cores </td><td class="markdownTableBodyRight">17 </td><td class="markdownTableBodyLeft">GNU 12.3.0 </td><td class="markdownTableBodyLeft">GT ICE </td></tr>
225+
<td class="markdownTableBodyRight">AMD EPYC 7401 </td><td class="markdownTableBodyRight">Naples </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">24 cores </td><td class="markdownTableBodyRight">10 </td><td class="markdownTableBodyLeft">GNU 10.3.1 </td><td class="markdownTableBodyLeft">LLNL Corona </td></tr>
226226
<tr class="markdownTableRowOdd">
227-
<td class="markdownTableBodyRight">Apple M1 Max </td><td class="markdownTableBodyRight"></td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">10 cores </td><td class="markdownTableBodyRight">20 </td><td class="markdownTableBodyLeft">GNU 14.1.0 </td><td class="markdownTableBodyLeft">N/A </td></tr>
227+
<td class="markdownTableBodyRight">Intel Xeon 6226 </td><td class="markdownTableBodyRight">Cascade Lake </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">12 cores </td><td class="markdownTableBodyRight">17 </td><td class="markdownTableBodyLeft">GNU 12.3.0 </td><td class="markdownTableBodyLeft">GT ICE </td></tr>
228228
<tr class="markdownTableRowEven">
229-
<td class="markdownTableBodyRight">IBM Power9 </td><td class="markdownTableBodyRight"></td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">20 cores </td><td class="markdownTableBodyRight">21 </td><td class="markdownTableBodyLeft">GNU 9.1.0 </td><td class="markdownTableBodyLeft">OLCF Summit </td></tr>
229+
<td class="markdownTableBodyRight">Apple M1 Max </td><td class="markdownTableBodyRight"></td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">10 cores </td><td class="markdownTableBodyRight">20 </td><td class="markdownTableBodyLeft">GNU 14.1.0 </td><td class="markdownTableBodyLeft">N/A </td></tr>
230230
<tr class="markdownTableRowOdd">
231-
<td class="markdownTableBodyRight">Cavium ThunderX2 </td><td class="markdownTableBodyRight">Arm </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">32 cores </td><td class="markdownTableBodyRight">21 </td><td class="markdownTableBodyLeft">GNU 13.2.0 </td><td class="markdownTableBodyLeft">SBU Ookami </td></tr>
231+
<td class="markdownTableBodyRight">IBM Power9 </td><td class="markdownTableBodyRight"></td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">20 cores </td><td class="markdownTableBodyRight">21 </td><td class="markdownTableBodyLeft">GNU 9.1.0 </td><td class="markdownTableBodyLeft">OLCF Summit </td></tr>
232232
<tr class="markdownTableRowEven">
233-
<td class="markdownTableBodyRight">Arm Cortex-A78AE </td><td class="markdownTableBodyRight">Arm, BlueField3 </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">16 cores </td><td class="markdownTableBodyRight">25 </td><td class="markdownTableBodyLeft">NVHPC 24.5 </td><td class="markdownTableBodyLeft">GT Rogues Gallery </td></tr>
233+
<td class="markdownTableBodyRight">Cavium ThunderX2 </td><td class="markdownTableBodyRight">Arm </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">32 cores </td><td class="markdownTableBodyRight">21 </td><td class="markdownTableBodyLeft">GNU 13.2.0 </td><td class="markdownTableBodyLeft">SBU Ookami </td></tr>
234234
<tr class="markdownTableRowOdd">
235-
<td class="markdownTableBodyRight">Intel Xeon E5-2650V4 </td><td class="markdownTableBodyRight">Broadwell </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">12 cores </td><td class="markdownTableBodyRight">27 </td><td class="markdownTableBodyLeft">NVHPC 23.5 </td><td class="markdownTableBodyLeft">GT CSE Internal </td></tr>
235+
<td class="markdownTableBodyRight">Arm Cortex-A78AE </td><td class="markdownTableBodyRight">Arm, BlueField3 </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">16 cores </td><td class="markdownTableBodyRight">25 </td><td class="markdownTableBodyLeft">NVHPC 24.5 </td><td class="markdownTableBodyLeft">GT Rogues Gallery </td></tr>
236236
<tr class="markdownTableRowEven">
237-
<td class="markdownTableBodyRight">Apple M2 </td><td class="markdownTableBodyRight"></td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">8 cores </td><td class="markdownTableBodyRight">32 </td><td class="markdownTableBodyLeft">GNU 14.1.0 </td><td class="markdownTableBodyLeft">N/A </td></tr>
237+
<td class="markdownTableBodyRight">Intel Xeon E5-2650V4 </td><td class="markdownTableBodyRight">Broadwell </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">12 cores </td><td class="markdownTableBodyRight">27 </td><td class="markdownTableBodyLeft">NVHPC 23.5 </td><td class="markdownTableBodyLeft">GT CSE Internal </td></tr>
238238
<tr class="markdownTableRowOdd">
239-
<td class="markdownTableBodyRight">Intel Xeon E7-4850V3 </td><td class="markdownTableBodyRight">Haswell </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">14 cores </td><td class="markdownTableBodyRight">34 </td><td class="markdownTableBodyLeft">GNU 9.4.0 </td><td class="markdownTableBodyLeft">GT CSE Internal </td></tr>
239+
<td class="markdownTableBodyRight">Apple M2 </td><td class="markdownTableBodyRight"></td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">8 cores </td><td class="markdownTableBodyRight">32 </td><td class="markdownTableBodyLeft">GNU 14.1.0 </td><td class="markdownTableBodyLeft">N/A </td></tr>
240240
<tr class="markdownTableRowEven">
241+
<td class="markdownTableBodyRight">Intel Xeon E7-4850V3 </td><td class="markdownTableBodyRight">Haswell </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">14 cores </td><td class="markdownTableBodyRight">34 </td><td class="markdownTableBodyLeft">GNU 9.4.0 </td><td class="markdownTableBodyLeft">GT CSE Internal </td></tr>
242+
<tr class="markdownTableRowOdd">
241243
<td class="markdownTableBodyRight">Fujitsu A64FX </td><td class="markdownTableBodyRight">Arm </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">48 cores </td><td class="markdownTableBodyRight">63 </td><td class="markdownTableBodyLeft">GNU 13.2.0 </td><td class="markdownTableBodyLeft">SBU Ookami </td></tr>
242244
</table>
243245
<p><b>All grind times are in nanoseconds (ns) per grid point (gp) per equation (eq) per right-hand side (rhs) evaluation, so X ns/gp/eq/rhs. Lower is better.</b></p>

documentation/md_running.html

Lines changed: 3 additions & 2 deletions
@@ -192,7 +192,8 @@ <h3><a class="anchor" id="autotoc_md97"></a>
192192
<li>Rocprof (ROC): <code>./mfc.sh run ... -t simulation --roc --hip-trace [rocprof flags]</code> allows one to visualize MFC's system-wide performance with <a href="https://ui.perfetto.dev/">Perfetto UI</a>. When used, <code>--roc</code> will run the simulation and generate files in the case directory for all targets. <code>results.json</code> can then be imported into <a href="https://ui.perfetto.dev/">Perfetto's UI</a>. Learn more about AMD Rocprof <a href="https://rocm.docs.amd.com/projects/rocprofiler/en/docs-5.5.1/rocprof.html">here</a>. It is best to run case files with few timesteps to keep the report file sizes manageable.</li>
193193
<li>Omniperf (OMNI): <code>./mfc.sh run ... -t simulation --omni [omniperf flags]</code> allows one to conduct kernel-level profiling with <a href="https://rocm.docs.amd.com/projects/omniperf/en/latest/index.html">AMD's Omniperf</a>. When used, <code>--omni</code> will output profiling information for all subroutines, including rooflines, cache usage, register usage, and more, after the simulation is run. Adding this argument will moderately slow down the simulation and run the MFC executable several times. For this reason, it should only be used with case files with few timesteps.</li>
194194
</ul>
195-
<h2><a class="anchor" id="restarting_cases"></a>
195+
<p><a class="anchor" id="restarting-cases"></a> </p>
196+
<h2><a class="anchor" id="autotoc_md98"></a>
196197
Restarting Cases</h2>
197198
<p>When running a simulation, MFC generates a <code>./restart_data</code> folder in the case directory that contains <code>lustre_*.dat</code> files that can be used to restart a simulation from saved timesteps. This allows a user to simulate some timestep $X$, then continue it to run to another timestep $Y$, where $Y &gt; X$. The user can also choose to add new patches at the intermediate timestep.</p>
198199
<p>If you want to restart a simulation,</p>
@@ -288,7 +289,7 @@ <h2><a class="anchor" id="restarting_cases"></a>
288289
<div class="line">./mfc.sh run examples/1D_vacuum_restart/restart_case.py -t pre_process simulation</div>
289290
<div class="line">./mfc.sh run examples/1D_vacuum_restart/case.py -t post_process</div>
290291
<div class="line">./mfc.sh run examples/1D_vacuum_restart/restart_case.py -t post_process</div>
291-
</div><!-- fragment --><h2><a class="anchor" id="autotoc_md98"></a>
292+
</div><!-- fragment --><h2><a class="anchor" id="autotoc_md99"></a>
292293
Example Runs</h2>
293294
<ul>
294295
<li>Oak Ridge National Laboratory's <a href="https://www.olcf.ornl.gov/summit/">Summit</a>:</li>

documentation/md_testing.html

Lines changed: 3 additions & 3 deletions
@@ -135,13 +135,13 @@
135135
<div class="headertitle"><div class="title">Testing</div></div>
136136
</div><!--header-->
137137
<div class="contents">
138-
<div class="textblock"><p><a class="anchor" id="autotoc_md99"></a></p>
138+
<div class="textblock"><p><a class="anchor" id="autotoc_md100"></a></p>
139139
<p>To run MFC's test suite, run </p><div class="fragment"><div class="line">./mfc.sh test -j &lt;thread count&gt;</div>
140140
</div><!-- fragment --><p>It will generate and run test cases, comparing their output to previous runs from versions of MFC considered accurate. <em>Golden files</em>, stored in the <code>tests/</code> directory, contain this data, aggregating the <code>.dat</code> files generated when running MFC. A test passes when our error tolerances are met; this maintains a high level of stability and accuracy. Run <code>./mfc.sh test -h</code> for a full list of accepted arguments.</p>
141141
<p>Most notably, you can consult the full list of tests by running </p><div class="fragment"><div class="line">./mfc.sh test -l</div>
142142
</div><!-- fragment --><p>To restrict to a given range, use the <code>--from</code> (<code>-f</code>) and <code>--to</code> (<code>-t</code>) options. To run a (non-contiguous) subset of tests, use the <code>--only</code> (<code>-o</code>) option instead. To specify a computer, pass the <code>-c</code> flag to <code>./mfc.sh run</code> like so: </p><div class="fragment"><div class="line">./mfc.sh test -j &lt;thread count&gt; -- -c &lt;computer name&gt;</div>
143143
</div><!-- fragment --><p> where <code>&lt;computer name&gt;</code> could be <code>phoenix</code> or any of the others in the <a href="https://github.com/MFlowCode/MFC/tree/master/toolchain/templates">templates</a>. You can create new templates with the appropriate run commands or omit this option. The <code>--</code> in the above command passes the options that follow it to the <code>./mfc.sh run</code> command underlying <code>./mfc.sh test</code>.</p>
144-
<h2><a class="anchor" id="autotoc_md100"></a>
144+
<h2><a class="anchor" id="autotoc_md101"></a>
145145
Creating Tests</h2>
146146
<p>To (re)generate <em>golden files</em>, append the <code>--generate</code> option: </p><div class="fragment"><div class="line">./mfc.sh test --generate -j 8</div>
147147
</div><!-- fragment --><p>It is recommended that a range be specified when generating golden files for new test cases, as described in the previous section, in an effort not to regenerate the golden files of existing test cases.</p>
@@ -185,7 +185,7 @@ <h2><a class="anchor" id="autotoc_md100"></a>
185185
</ul>
186186
<p>If a trace is empty (that is, the empty string <code>""</code>), it will not appear in the final trace, but any case parameter variations associated with it will still be applied.</p>
187187
<p>Finally, the case is appended to the <code>cases</code> list, which will be returned by the <code>list_cases</code> function.</p>
188-
<h2><a class="anchor" id="autotoc_md101"></a>
188+
<h2><a class="anchor" id="autotoc_md102"></a>
189189
Testing Post Process</h2>
190190
<p>To test the post-processing code, append the <code>-a</code> or <code>--test-all</code> option: </p><div class="fragment"><div class="line">./mfc.sh test -a -j 8</div>
191191
</div><!-- fragment --><p>This argument will re-run the test stack with <code>parallel_io='T'</code>, which generates silo_hdf5 files. It will also turn most write parameters (<code>*_wrt</code>) on. Then, it searches through the silo files using <code>h5dump</code> to ensure that there are no <code>NaN</code>s or <code>Infinity</code>s. Although adding this option does not guarantee that accurate <code>.silo</code> files are generated, it does ensure that the post-process code does not fail or produce malformed data. </p>
