<p>Woodward, P., & Colella, P. (1984). The numerical simulation of two-dimensional fluid flow with strong shocks. <em>Journal of Computational Physics, 54</em>(1), 115–173. <a href="https://doi.org/10.1016/0021-9991(84)90140-2">https://doi.org/10.1016/0021-9991(84)90140-2</a></p>
<p>Chamarthi, A., Hoffmann, N., Nishikawa, H., & Frankel, S. (2023). Implicit gradients based conservative numerical scheme for compressible flows. arXiv:2110.05461.</p>
<p>Hillewaert, K. (2013). Test case C3.5: DNS of the transition of the Taylor-Green vortex, Re = 1600. Introduction and result summary. <em>2nd International Workshop on High-Order CFD Methods</em>.</p>
</blockquote>
<h2><a class="anchor" id="autotoc_md77"></a>
Final Condition</h2>
<p>This figure shows the isosurface of zero Q-criterion.</p>
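<p>For reference, the Q-criterion is the standard second invariant of the velocity-gradient tensor (a general definition, not specific to this case); its zero isosurface separates rotation-dominated from strain-dominated regions:</p>
<p>\[ Q = \tfrac{1}{2}\left(\lVert \boldsymbol{\Omega} \rVert_F^2 - \lVert \mathbf{S} \rVert_F^2\right), \qquad \mathbf{S} = \tfrac{1}{2}\left(\nabla\mathbf{u} + \nabla\mathbf{u}^{\mathsf{T}}\right), \qquad \boldsymbol{\Omega} = \tfrac{1}{2}\left(\nabla\mathbf{u} - \nabla\mathbf{u}^{\mathsf{T}}\right) \]</p>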
<p>Titarev, V. A., & Toro, E. F. (2004). Finite-volume WENO schemes for three-dimensional conservation laws. <em>Journal of Computational Physics, 201</em>(1), 238–260.</p>
<p>Lax, P. D. (1954). Weak solutions of nonlinear hyperbolic equations and their numerical computation. <em>Communications on Pure and Applied Mathematics, 7</em>(1), 159–193.</p>
<p>Martínez Ferrer, P. J., Buttay, R., Lehnasch, G., & Mura, A. (2014). A detailed verification procedure for compressible reactive multicomponent Navier–Stokes solvers. <em>Computers & Fluids, 89</em>, 88–110. <a href="https://doi.org/10.1016/j.compfluid.2013.10.014">https://doi.org/10.1016/j.compfluid.2013.10.014</a></p>
</blockquote>
<blockquote class="doxtable">
<p>Chen, H., Si, C., Wu, Y., Hu, H., & Zhu, Y. (2023). Numerical investigation of the effect of equivalence ratio on the propagation characteristics and performance of rotating detonation engine. <em>International Journal of Hydrogen Energy</em>. <a href="https://doi.org/10.1016/j.ijhydene.2023.03.190">https://doi.org/10.1016/j.ijhydene.2023.03.190</a></p>
<p>MFC has been benchmarked on several CPU and GPU devices. This page summarizes these results.</p>
<h1><a class="anchor" id="autotoc_md92"></a>
Figure of merit: Grind time performance</h1>
<p>The following table outlines observed performance as nanoseconds per grid point (ns/gp) per equation (eq) per right-hand side (rhs) evaluation (lower is better), also known as the grind time. We solve an example 3D, inviscid, 5-equation model problem with two advected species (8 PDEs) and about 4M grid points (158-cubed uniform grid). The numerics are WENO5 finite-volume reconstruction and the HLLC approximate Riemann solver. This case is located in <code>examples/3D_performance_test</code>. For CPU cases, you can run it right after building MFC via <code>./mfc.sh run -n <num_processors> -j $(nproc) ./examples/3D_performance_test/case.py -t pre_process simulation --case-optimization</code>, which builds a version of the code optimized for this case and then executes it. For benchmarking GPU devices, use <code>-n <num_gpus></code>, where <code><num_gpus></code> is typically <code>1</code>. If the above does not work on your machine, see the rest of this documentation for other ways to use the <code>./mfc.sh run</code> command.</p>
<p>Results are for MFC v4.9.3 (July 2024 release), though numbers have not changed meaningfully since then. Similar performance is also seen for other problem configurations, such as the Euler equations (4 PDEs). All results are for the compiler that gave the best performance. Note:</p><ul>
<li><b>All grind times are in nanoseconds (ns) per grid point (gp) per equation (eq) per right-hand side (rhs) evaluation, i.e., ns/gp/eq/rhs. Lower is better.</b></li>
</ul>
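<p>Equivalently, as a formula (a sketch of how the metric is defined, assuming \(t_{\mathrm{wall}}\) is the measured wall-clock time in seconds over \(N_{\mathrm{steps}}\) time steps, with \(N_{\mathrm{rhs}}\) right-hand-side evaluations per step, \(N_{\mathrm{gp}}\) grid points, and \(N_{\mathrm{eq}}\) equations):</p>
<p>\[ \text{grind time} = \frac{10^{9}\, t_{\mathrm{wall}}}{N_{\mathrm{gp}}\, N_{\mathrm{eq}}\, N_{\mathrm{rhs}}\, N_{\mathrm{steps}}} \ \ \text{ns/gp/eq/rhs} \]</p>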
<h1><a class="anchor" id="autotoc_md93"></a>
Weak scaling</h1>
<p>Weak scaling results are obtained by increasing the problem size with the number of processes so that work per process remains constant.</p>
<h2><a class="anchor" id="autotoc_md94"></a>
AMD MI250X GPU</h2>
<p>MFC weak scales to (at least) 65,536 AMD MI250X GPUs on OLCF Frontier with 96% efficiency. This corresponds to 87% of the entire machine.</p>
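<p>Here, weak-scaling efficiency has its usual meaning (with \(t(N)\) the wall time on \(N\) GPUs at fixed work per GPU):</p>
<p>\[ \eta_{\text{weak}}(N) = \frac{t(N_{\text{base}})}{t(N)} \]</p>
<p>so 96% efficiency means the 65,536-GPU runs take only about 4% longer per step than the base case.</p>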
<h1><a class="anchor" id="autotoc_md97"></a>
Strong scaling</h1>
<p>Strong scaling results are obtained by keeping the problem size constant and increasing the number of processes so that work per process decreases.</p>
<h2><a class="anchor" id="autotoc_md98"></a>
NVIDIA V100 GPU</h2>
<p>These tests use a base case of 8 GPUs with one MPI process per GPU. Performance is analyzed at two problem sizes, 16M and 64M grid points, corresponding to 2M and 8M grid points per process.</p>
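<p>Strong-scaling speedup and efficiency relative to this base case follow the usual definitions (with \(t(N)\) the wall time on \(N\) GPUs and \(N_{\text{base}} = 8\)):</p>
<p>\[ S(N) = \frac{t(N_{\text{base}})}{t(N)}, \qquad \eta_{\text{strong}}(N) = S(N)\,\frac{N_{\text{base}}}{N} \]</p>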
<div class="line">! [ -z "${BOOST_INCLUDE+x}" ] && echo 'Environment is ready!' || echo 'Error: $BOOST_INCLUDE is unset. Please adjust the previous commands to fit with your environment.'</div>
</div><!-- fragment --><p>These commands download the dependencies MFC requires to build itself.</p>
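<p>If the check instead prints the error, pointing <code>BOOST_INCLUDE</code> at your Boost headers first should satisfy it. The path below is only a placeholder; substitute the header directory of your own Boost installation:</p>
<div class="fragment"><div class="line"># Hypothetical path; replace with the directory containing your Boost headers</div>
<div class="line">export BOOST_INCLUDE=/path/to/boost/include</div>
</div><!-- fragment -->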
<h1><a class="anchor" id="autotoc_md105"></a>
Building MFC</h1>
<p>MFC can be built with support for various (compile-time) features:</p>
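<p>As a minimal sketch of a typical build invocation (the <code>-t</code> targets and <code>-j</code> flag are assumed to behave as in the <code>./mfc.sh run</code> example earlier on this page; consult the full documentation for the compile-time feature flags):</p>
<div class="fragment"><div class="line"># Build the pre-process and simulation targets in parallel (sketch; flags mirror the run example above)</div>
<div class="line">./mfc.sh build -t pre_process simulation -j $(nproc)</div>
</div><!-- fragment -->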