Commit 70129b6

author
MFC Action
committed
Docs @ 5ee319c
1 parent 7325278 commit 70129b6

File tree

635 files changed

+22881
-13584
lines changed


documentation/doxygen_crawl.html

Lines changed: 18 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -98,9 +98,9 @@
9898
<a href="md_examples.html#autotoc_md82"/>
9999
<a href="md_examples.html#autotoc_md83"/>
100100
<a href="md_examples.html#autotoc_md84"/>
101+
<a href="md_examples.html#autotoc_md85"/>
102+
<a href="md_examples.html#autotoc_md86"/>
101103
<a href="md_expectedPerformance.html"/>
102-
<a href="md_expectedPerformance.html#autotoc_md86"/>
103-
<a href="md_expectedPerformance.html#autotoc_md87"/>
104104
<a href="md_expectedPerformance.html#autotoc_md88"/>
105105
<a href="md_expectedPerformance.html#autotoc_md89"/>
106106
<a href="md_expectedPerformance.html#autotoc_md90"/>
@@ -109,15 +109,15 @@
109109
<a href="md_expectedPerformance.html#autotoc_md93"/>
110110
<a href="md_expectedPerformance.html#autotoc_md94"/>
111111
<a href="md_expectedPerformance.html#autotoc_md95"/>
112+
<a href="md_expectedPerformance.html#autotoc_md96"/>
113+
<a href="md_expectedPerformance.html#autotoc_md97"/>
112114
<a href="md_getting-started.html"/>
113115
<a href="md_getting-started.html#autotoc_md100"/>
114116
<a href="md_getting-started.html#autotoc_md101"/>
115-
<a href="md_getting-started.html#autotoc_md97"/>
116-
<a href="md_getting-started.html#autotoc_md98"/>
117+
<a href="md_getting-started.html#autotoc_md102"/>
118+
<a href="md_getting-started.html#autotoc_md103"/>
117119
<a href="md_getting-started.html#autotoc_md99"/>
118120
<a href="md_gpuDebugging.html"/>
119-
<a href="md_gpuDebugging.html#autotoc_md103"/>
120-
<a href="md_gpuDebugging.html#autotoc_md104"/>
121121
<a href="md_gpuDebugging.html#autotoc_md105"/>
122122
<a href="md_gpuDebugging.html#autotoc_md106"/>
123123
<a href="md_gpuDebugging.html#autotoc_md107"/>
@@ -126,39 +126,41 @@
126126
<a href="md_gpuDebugging.html#autotoc_md110"/>
127127
<a href="md_gpuDebugging.html#autotoc_md111"/>
128128
<a href="md_gpuDebugging.html#autotoc_md112"/>
129+
<a href="md_gpuDebugging.html#autotoc_md113"/>
130+
<a href="md_gpuDebugging.html#autotoc_md114"/>
129131
<a href="md_gpuParallelization.html"/>
130-
<a href="md_gpuParallelization.html#autotoc_md115"/>
131-
<a href="md_gpuParallelization.html#autotoc_md116"/>
132132
<a href="md_gpuParallelization.html#autotoc_md117"/>
133+
<a href="md_gpuParallelization.html#autotoc_md118"/>
133134
<a href="md_gpuParallelization.html#autotoc_md119"/>
134135
<a href="md_gpuParallelization.html#autotoc_md121"/>
135136
<a href="md_gpuParallelization.html#autotoc_md123"/>
136137
<a href="md_gpuParallelization.html#autotoc_md125"/>
138+
<a href="md_gpuParallelization.html#autotoc_md127"/>
137139
<a href="md_papers.html"/>
138140
<a href="md_readme.html"/>
139-
<a href="md_readme.html#autotoc_md129"/>
140-
<a href="md_readme.html#autotoc_md130"/>
141+
<a href="md_readme.html#autotoc_md131"/>
142+
<a href="md_readme.html#autotoc_md132"/>
141143
<a href="md_references.html"/>
142144
<a href="md_running.html"/>
143-
<a href="md_running.html#autotoc_md133"/>
144-
<a href="md_running.html#autotoc_md134"/>
145145
<a href="md_running.html#autotoc_md135"/>
146146
<a href="md_running.html#autotoc_md136"/>
147147
<a href="md_running.html#autotoc_md137"/>
148148
<a href="md_running.html#autotoc_md138"/>
149149
<a href="md_running.html#autotoc_md139"/>
150+
<a href="md_running.html#autotoc_md140"/>
151+
<a href="md_running.html#autotoc_md141"/>
150152
<a href="md_testing.html"/>
151-
<a href="md_testing.html#autotoc_md141"/>
152-
<a href="md_testing.html#autotoc_md142"/>
153+
<a href="md_testing.html#autotoc_md143"/>
154+
<a href="md_testing.html#autotoc_md144"/>
153155
<a href="md_visualization.html"/>
154-
<a href="md_visualization.html#autotoc_md144"/>
155-
<a href="md_visualization.html#autotoc_md145"/>
156156
<a href="md_visualization.html#autotoc_md146"/>
157157
<a href="md_visualization.html#autotoc_md147"/>
158158
<a href="md_visualization.html#autotoc_md148"/>
159159
<a href="md_visualization.html#autotoc_md149"/>
160160
<a href="md_visualization.html#autotoc_md150"/>
161161
<a href="md_visualization.html#autotoc_md151"/>
162+
<a href="md_visualization.html#autotoc_md152"/>
163+
<a href="md_visualization.html#autotoc_md153"/>
162164
<a href="pages.html"/>
163165
</body>
164166
</html>

documentation/md_case.html

Lines changed: 11 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -705,22 +705,24 @@ <h2><a class="anchor" id="autotoc_md19"></a>
705705
<tr class="markdownTableRowOdd">
706706
<td class="markdownTableBodyRight"><code>qm_wrt</code> </td><td class="markdownTableBodyCenter">Logical </td><td class="markdownTableBodyLeft">Add the Q-criterion to the database </td></tr>
707707
<tr class="markdownTableRowEven">
708-
<td class="markdownTableBodyRight"><code>tau_wrt</code> </td><td class="markdownTableBodyCenter">Logical </td><td class="markdownTableBodyLeft">Add the elastic stress components to the database </td></tr>
708+
<td class="markdownTableBodyRight"><code>liutex_wrt</code> </td><td class="markdownTableBodyCenter">Logical </td><td class="markdownTableBodyLeft">Add the Liutex to the database </td></tr>
709709
<tr class="markdownTableRowOdd">
710-
<td class="markdownTableBodyRight"><code>fd_order</code> </td><td class="markdownTableBodyCenter">Integer </td><td class="markdownTableBodyLeft">Order of finite differences for computing the vorticity and the numerical Schlieren function [1,2,4] </td></tr>
710+
<td class="markdownTableBodyRight"><code>tau_wrt</code> </td><td class="markdownTableBodyCenter">Logical </td><td class="markdownTableBodyLeft">Add the elastic stress components to the database </td></tr>
711711
<tr class="markdownTableRowEven">
712-
<td class="markdownTableBodyRight"><code>schlieren_alpha(i)</code> </td><td class="markdownTableBodyCenter">Real </td><td class="markdownTableBodyLeft">Intensity of the numerical Schlieren computed via <code>alpha(i)</code> </td></tr>
712+
<td class="markdownTableBodyRight"><code>fd_order</code> </td><td class="markdownTableBodyCenter">Integer </td><td class="markdownTableBodyLeft">Order of finite differences for computing the vorticity and the numerical Schlieren function [1,2,4] </td></tr>
713713
<tr class="markdownTableRowOdd">
714-
<td class="markdownTableBodyRight"><code>probe_wrt</code> </td><td class="markdownTableBodyCenter">Logical </td><td class="markdownTableBodyLeft">Write the data files for the chosen flow probes at each time step </td></tr>
714+
<td class="markdownTableBodyRight"><code>schlieren_alpha(i)</code> </td><td class="markdownTableBodyCenter">Real </td><td class="markdownTableBodyLeft">Intensity of the numerical Schlieren computed via <code>alpha(i)</code> </td></tr>
715715
<tr class="markdownTableRowEven">
716-
<td class="markdownTableBodyRight"><code>num_probes</code> </td><td class="markdownTableBodyCenter">Integer </td><td class="markdownTableBodyLeft">Number of probes </td></tr>
716+
<td class="markdownTableBodyRight"><code>probe_wrt</code> </td><td class="markdownTableBodyCenter">Logical </td><td class="markdownTableBodyLeft">Write the data files for the chosen flow probes at each time step </td></tr>
717717
<tr class="markdownTableRowOdd">
718-
<td class="markdownTableBodyRight"><code>probe(i)%[x,y,z]</code> </td><td class="markdownTableBodyCenter">Real </td><td class="markdownTableBodyLeft">Coordinates of probe $i$ </td></tr>
718+
<td class="markdownTableBodyRight"><code>num_probes</code> </td><td class="markdownTableBodyCenter">Integer </td><td class="markdownTableBodyLeft">Number of probes </td></tr>
719719
<tr class="markdownTableRowEven">
720-
<td class="markdownTableBodyRight"><code>output_partial_domain</code> </td><td class="markdownTableBodyCenter">Logical </td><td class="markdownTableBodyLeft">Output part of the domain </td></tr>
720+
<td class="markdownTableBodyRight"><code>probe(i)%[x,y,z]</code> </td><td class="markdownTableBodyCenter">Real </td><td class="markdownTableBodyLeft">Coordinates of probe $i$ </td></tr>
721721
<tr class="markdownTableRowOdd">
722-
<td class="markdownTableBodyRight"><code>[x,y,z]_outputbeg</code> </td><td class="markdownTableBodyCenter">Real </td><td class="markdownTableBodyLeft">Beginning of the output domain in the [x,y,z]-direction </td></tr>
722+
<td class="markdownTableBodyRight"><code>output_partial_domain</code> </td><td class="markdownTableBodyCenter">Logical </td><td class="markdownTableBodyLeft">Output part of the domain </td></tr>
723723
<tr class="markdownTableRowEven">
724+
<td class="markdownTableBodyRight"><code>[x,y,z]_outputbeg</code> </td><td class="markdownTableBodyCenter">Real </td><td class="markdownTableBodyLeft">Beginning of the output domain in the [x,y,z]-direction </td></tr>
725+
<tr class="markdownTableRowOdd">
724726
<td class="markdownTableBodyRight"><code>[x,y,z]_outputend</code> </td><td class="markdownTableBodyCenter">Real </td><td class="markdownTableBodyLeft">End of the output domain in the [x,y,z]-direction </td></tr>
725727
</table>
726728
<p>The table lists formatted database output parameters. The parameters define which variables are output from the simulation, the file types and formats of the data, and options for post-processing.</p>
@@ -734,7 +736,7 @@ <h2><a class="anchor" id="autotoc_md19"></a>
734736
<li><code>schlieren_alpha(i)</code> specifies the intensity of the numerical Schlieren of $i$-th component.</li>
735737
<li><code>fd_order</code> specifies the order of the finite difference scheme used to compute the vorticity from the velocity field and the numerical Schlieren from the density field. It takes an integer value of 1, 2, or 4: <code>fd_order = 1</code>, <code>2</code>, and <code>4</code> correspond to the first-, second-, and fourth-order finite difference schemes, respectively.</li>
736738
<li><code>probe_wrt</code> activates the output of state variables at coordinates specified by <code>probe(i)%[x,y,z]</code>.</li>
737-
<li><code>output_partial_domain</code> activates the output of part of the domain specified by <code>[x,y,z]_outputbeg</code> and <code>[x,y,z]_outputend</code>. This is useful for large domains where only a portion of the domain is of interest. It is not supported when <code>precision = 1</code> and <code>format = 1</code>. It also cannot be enabled with <code>flux_wrt</code>, <code>heat_ratio_wrt</code>, <code>pres_inf_wrt</code>, <code>c_wrt</code>, <code>omega_wrt</code>, <code>ib</code>, <code>schlieren_wrt</code>, or <code>qm_wrt</code>.</li>
739+
<li><code>output_partial_domain</code> activates the output of part of the domain specified by <code>[x,y,z]_outputbeg</code> and <code>[x,y,z]_outputend</code>. This is useful for large domains where only a portion of the domain is of interest. It is not supported when <code>precision = 1</code> and <code>format = 1</code>. It also cannot be enabled with <code>flux_wrt</code>, <code>heat_ratio_wrt</code>, <code>pres_inf_wrt</code>, <code>c_wrt</code>, <code>omega_wrt</code>, <code>ib</code>, <code>schlieren_wrt</code>, <code>qm_wrt</code>, or <code>liutex_wrt</code>.</li>
738740
</ul>
739741
<h2><a class="anchor" id="acoustic-source"></a>
740742
8. Acoustic Source</h2>
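
The <code>output_partial_domain</code> restrictions described in this hunk can be sketched as a small pre-flight check on a case dictionary. This is a minimal illustration, not MFC's own validation: the helper name `partial_output_conflicts` is hypothetical, logicals are assumed to be stored as the strings "T" and "F", and the precision/format restriction is assumed to apply only when both are set to 1.

```python
# Hypothetical helper for illustration only; it is not part of MFC.
# Parameter names are taken from the documentation text in this diff.
INCOMPATIBLE_WITH_PARTIAL_OUTPUT = (
    "flux_wrt", "heat_ratio_wrt", "pres_inf_wrt", "c_wrt",
    "omega_wrt", "ib", "schlieren_wrt", "qm_wrt", "liutex_wrt",
)


def partial_output_conflicts(case):
    """Return the settings that conflict with output_partial_domain.

    `case` is a plain dict of parameters. Logicals are assumed to be
    the strings "T" / "F" (an assumption of this sketch).
    """
    if case.get("output_partial_domain", "F") != "T":
        return []  # feature is off, so nothing can conflict

    problems = []
    # Assumption: the restriction applies when BOTH values equal 1.
    if case.get("precision") == 1 and case.get("format") == 1:
        problems.append("precision = 1 with format = 1")
    # Any of the listed outputs being enabled is a conflict.
    for name in INCOMPATIBLE_WITH_PARTIAL_OUTPUT:
        if case.get(name, "F") == "T":
            problems.append(name)
    return problems


if __name__ == "__main__":
    example = {"output_partial_domain": "T", "qm_wrt": "T", "liutex_wrt": "T"}
    print(partial_output_conflicts(example))
```

Returning a list of conflicting names, rather than raising on the first one, lets a caller report every problem in a case file at once.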

documentation/md_examples.html

Lines changed: 15 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -278,38 +278,43 @@ <h2><a class="anchor" id="autotoc_md73"></a>
278278
Initial Condition and Result</h2>
279279
<p><img src="initial-2D_hardcoded_ic-example.png" alt="" width="45%" class="inline"/> <img src="result-2D_hardcoded_ic-example.png" alt="" width="45%" class="inline"/></p>
280280
<h1><a class="anchor" id="autotoc_md74"></a>
281-
Rayleigh-Taylor Instability (3D)</h1>
281+
3D Turbulent Mixing Layer (3D)</h1>
282282
<h2><a class="anchor" id="autotoc_md75"></a>
283+
Liutex visualization at transitional state</h2>
284+
<p><img src="result-3D_turb_mixing-example.png" alt="" height="400" class="inline"/></p>
285+
<h1><a class="anchor" id="autotoc_md76"></a>
286+
Rayleigh-Taylor Instability (3D)</h1>
287+
<h2><a class="anchor" id="autotoc_md77"></a>
283288
Final Condition and Linear Theory</h2>
284289
<p><img src="final_condition-3D_rayleigh_taylor-example.png" alt="" height="400" class="inline"/> <img src="linear_theory-3D_rayleigh_taylor-example.png" alt="" height="400" class="inline"/></p>
285-
<h1><a class="anchor" id="autotoc_md76"></a>
290+
<h1><a class="anchor" id="autotoc_md78"></a>
286291
Taylor-Green Vortex (3D)</h1>
287292
<p>Reference: </p><blockquote class="doxtable">
288293
<p>Hillewaert, K. (2013). TestCase C3.5 - DNS of the transition of the Taylor-Green vortex, Re=1600 - Introduction and result summary. 2nd International Workshop on high-order methods for CFD. </p>
289294
</blockquote>
290-
<h2><a class="anchor" id="autotoc_md77"></a>
295+
<h2><a class="anchor" id="autotoc_md79"></a>
291296
Final Condition</h2>
292297
<p>This figure shows the isosurface of zero Q-criterion.</p>
293298
<p><img src="result-3D_TaylorGreenVortex-example.png" alt="" height="400" class="inline"/></p>
294-
<h1><a class="anchor" id="autotoc_md78"></a>
299+
<h1><a class="anchor" id="autotoc_md80"></a>
295300
Gas Jet (2D)</h1>
296-
<h2><a class="anchor" id="autotoc_md79"></a>
301+
<h2><a class="anchor" id="autotoc_md81"></a>
297302
Final Condition</h2>
298303
<p><img src="final_condition-2D_jet-example.png" alt="" height="400" class="inline"/></p>
299-
<h1><a class="anchor" id="autotoc_md80"></a>
304+
<h1><a class="anchor" id="autotoc_md82"></a>
300305
1D Multi-Component Inert Shock Tube</h1>
301306
<p>Reference: </p><blockquote class="doxtable">
302307
<p>P. J. Martínez Ferrer, R. Buttay, G. Lehnasch, and A. Mura, “A detailed verification procedure for compressible reactive multicomponent Navier–Stokes solvers”, Computers &amp; Fluids, vol. 89, pp. 88–110, Jan. 2014. Accessed: Oct. 13, 2024. [Online]. Available: <a href="https://doi.org/10.1016/j.compfluid.2013.10.014">https://doi.org/10.1016/j.compfluid.2013.10.014</a> </p>
303308
</blockquote>
304-
<h2><a class="anchor" id="autotoc_md81"></a>
309+
<h2><a class="anchor" id="autotoc_md83"></a>
305310
Initial Condition</h2>
306311
<p><img src="initial-1D_inert_shocktube-example.png" alt="" height="400" class="inline"/></p>
307-
<h2><a class="anchor" id="autotoc_md82"></a>
312+
<h2><a class="anchor" id="autotoc_md84"></a>
308313
Results</h2>
309314
<p><img src="result-1D_inert_shocktube-example.png" alt="" height="400" class="inline"/></p>
310-
<h1><a class="anchor" id="autotoc_md83"></a>
315+
<h1><a class="anchor" id="autotoc_md85"></a>
311316
2D IBM CFL dt (2D)</h1>
312-
<h2><a class="anchor" id="autotoc_md84"></a>
317+
<h2><a class="anchor" id="autotoc_md86"></a>
313318
Result</h2>
314319
<p><img src="result-2D_ibm_cfl_dt-example.png" alt="" height="400" class="inline"/> </p>
315320
</div></div><!-- contents -->

documentation/md_expectedPerformance.html

Lines changed: 11 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -135,9 +135,9 @@
135135
<div class="headertitle"><div class="title">Performance</div></div>
136136
</div><!--header-->
137137
<div class="contents">
138-
<div class="textblock"><p><a class="anchor" id="autotoc_md85"></a></p>
138+
<div class="textblock"><p><a class="anchor" id="autotoc_md87"></a></p>
139139
<p>MFC has been benchmarked on several CPU and GPU devices. This page summarizes the results.</p>
140-
<h1><a class="anchor" id="autotoc_md86"></a>
140+
<h1><a class="anchor" id="autotoc_md88"></a>
141141
Figure of merit: Grind time performance</h1>
142142
<p>The following table outlines observed performance as nanoseconds per grid point (ns/gp) per equation (eq) per right-hand side (rhs) evaluation (lower is better), also known as the grind time. We solve an example 3D, inviscid, 5-equation model problem with two advected species (8 PDEs) and 8M grid points (158-cubed uniform grid). The numerics are WENO5 finite volume reconstruction and HLLC approximate Riemann solver. This case is located in <code>examples/3D_performance_test</code>. You can run it via <code>./mfc.sh run -n &lt;num_processors&gt; -j $(nproc) ./examples/3D_performance_test/case.py -t pre_process simulation --case-optimization</code> for CPU cases right after building MFC, which will build an optimized version of the code for this case then execute it. For benchmarking GPU devices, you will likely want to use <code>-n &lt;num_gpus&gt;</code> where <code>&lt;num_gpus&gt;</code> should likely be <code>1</code>. If the above does not work on your machine, see the rest of this documentation for other ways to use the <code>./mfc.sh run</code> command.</p>
143143
<p>Results are for MFC v4.9.3 (July 2024 release), though numbers have not changed meaningfully since then. Similar performance is also seen for other problem configurations, such as the Euler equations (4 PDEs). All results are for the compiler that gave the best performance. Note:</p><ul>
@@ -249,34 +249,34 @@ <h1><a class="anchor" id="autotoc_md86"></a>
249249
<td class="markdownTableBodyRight">Fujitsu A64FX </td><td class="markdownTableBodyRight">Arm </td><td class="markdownTableBodyRight">CPU </td><td class="markdownTableBodyRight">48 cores </td><td class="markdownTableBodyRight">63 </td><td class="markdownTableBodyLeft">GNU 13.2.0 </td><td class="markdownTableBodyLeft">SBU Ookami </td></tr>
250250
</table>
251251
<p><b>All grind times are in nanoseconds (ns) per grid point (gp) per equation (eq) per right-hand side (rhs) evaluation, so X ns/gp/eq/rhs. Lower is better.</b></p>
252-
<h1><a class="anchor" id="autotoc_md87"></a>
252+
<h1><a class="anchor" id="autotoc_md89"></a>
253253
Weak scaling</h1>
254254
<p>Weak scaling results are obtained by increasing the problem size with the number of processes so that work per process remains constant.</p>
255-
<h2><a class="anchor" id="autotoc_md88"></a>
255+
<h2><a class="anchor" id="autotoc_md90"></a>
256256
AMD MI250X GPU</h2>
257257
<p>MFC weak scales to (at least) 65,536 AMD MI250X GPUs on OLCF Frontier with 96% efficiency. This corresponds to 87% of the entire machine.</p>
258258
<p><img src="../res/weakScaling/frontier.svg" alt="" style="height: 50%; width:50%; border-radius: 10pt; pointer-events: none;" class="inline"/></p>
259-
<h2><a class="anchor" id="autotoc_md89"></a>
259+
<h2><a class="anchor" id="autotoc_md91"></a>
260260
NVIDIA V100 GPU</h2>
261261
<p>MFC weak scales to (at least) 13,824 NVIDIA V100 GPUs on OLCF Summit with 97% efficiency. This corresponds to 50% of the entire machine.</p>
262262
<p><img src="../res/weakScaling/summit.svg" alt="" style="height: 50%; width:50%; border-radius: 10pt; pointer-events: none;" class="inline"/></p>
263-
<h2><a class="anchor" id="autotoc_md90"></a>
263+
<h2><a class="anchor" id="autotoc_md92"></a>
264264
IBM Power9 CPU</h2>
265265
<p>MFC weak scales to 13,824 Power9 CPU cores on OLCF Summit to within 1% of ideal scaling.</p>
266266
<p><img src="../res/weakScaling/cpuScaling.svg" alt="" style="height: 50%; width:50%; border-radius: 10pt; pointer-events: none;" class="inline"/></p>
267-
<h1><a class="anchor" id="autotoc_md91"></a>
267+
<h1><a class="anchor" id="autotoc_md93"></a>
268268
Strong scaling</h1>
269269
<p>Strong scaling results are obtained by keeping the problem size constant and increasing the number of processes so that work per process decreases.</p>
270-
<h2><a class="anchor" id="autotoc_md92"></a>
270+
<h2><a class="anchor" id="autotoc_md94"></a>
271271
NVIDIA V100 GPU</h2>
272272
<p>The base case uses 8 GPUs with one MPI process per GPU for these tests. Performance is analyzed at two problem sizes, 16M and 64M grid points, for which the base case has 2M and 8M grid points per process, respectively.</p>
273-
<h3><a class="anchor" id="autotoc_md93"></a>
273+
<h3><a class="anchor" id="autotoc_md95"></a>
274274
16M Grid Points</h3>
275275
<p><img src="../res/strongScaling/strongScaling16.svg" alt="" style="width: 50%; border-radius: 10pt; pointer-events: none;" class="inline"/></p>
276-
<h3><a class="anchor" id="autotoc_md94"></a>
276+
<h3><a class="anchor" id="autotoc_md96"></a>
277277
64M Grid Points</h3>
278278
<p><img src="../res/strongScaling/strongScaling64.svg" alt="" style="width: 50%; border-radius: 10pt; pointer-events: none;" class="inline"/></p>
279-
<h2><a class="anchor" id="autotoc_md95"></a>
279+
<h2><a class="anchor" id="autotoc_md97"></a>
280280
IBM Power9 CPU</h2>
281281
<p>CPU strong scaling tests are done with problem sizes of 16, 32, and 64M grid points, with the base case using 2, 4, and 8M cells per process.</p>
282282
<p><img src="../res/strongScaling/cpuStrongScaling.svg" alt="" style="width: 50%; border-radius: 10pt; pointer-events: none;" class="inline"/> </p>
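
The figures of merit used on the performance page (grind time in ns/gp/eq/rhs, and weak and strong scaling efficiency) can be written out as short formulas. The sketch below is illustrative and is not taken from MFC: the function names are hypothetical, the input numbers are invented, and the efficiency definitions are the conventional ones, which are assumed to match those behind the quoted percentages.

```python
def grind_time_ns(wall_seconds, grid_points, equations, rhs_evals):
    """Nanoseconds per grid point, per equation, per right-hand-side evaluation.

    rhs_evals is the total number of RHS evaluations in the timed region
    (for a multi-stage time stepper this would usually be steps x stages;
    that bookkeeping is an assumption of this sketch).
    """
    return wall_seconds * 1e9 / (grid_points * equations * rhs_evals)


def weak_scaling_efficiency(t_base, t_scaled):
    """Work per process is fixed, so ideal scaling keeps the wall time constant."""
    return t_base / t_scaled


def strong_scaling_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Total work is fixed, so ideal scaling keeps (time x processes) constant."""
    return (t_base * n_base) / (t_scaled * n_scaled)


if __name__ == "__main__":
    # Invented numbers, chosen only to make the arithmetic easy to follow.
    print(grind_time_ns(wall_seconds=1.0, grid_points=1000, equations=8, rhs_evals=125))
    print(weak_scaling_efficiency(t_base=10.0, t_scaled=12.5))
    print(strong_scaling_efficiency(t_base=100.0, n_base=8, t_scaled=60.0, n_scaled=16))
```

A lower grind time is better, while both efficiencies approach 1 (100%) for ideal scaling.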
