Commit 6c18780

Update expectedPerformance.md (#539)
1 parent 171343a commit 6c18780

2 files changed, 14 additions and 11 deletions

docs/documentation/expectedPerformance.md

Lines changed: 13 additions & 10 deletions
@@ -1,22 +1,23 @@
-# Performance Results
+# Performance
 
 MFC has been benchmarked on several CPUs and GPU devices.
 This page shows a summary of these results.
 
-## Expected time-steps/hour
+## Figure of merit: Grind time performance
 
-The following table outlines observed performance as nanoseconds per grid point (ns/GP) per equation (eq) per right-hand side (rhs) evaluation (lower is better).
+The following table outlines observed performance as nanoseconds per grid point (ns/GP) per equation (eq) per right-hand side (rhs) evaluation (lower is better), also known as the grind time.
 We solve an example 3D, inviscid, 5-equation model problem with two advected species (8 PDEs) and 8M grid points (158-cubed uniform grid).
 The numerics are WENO5 finite volume reconstruction and HLLC approximate Riemann solver.
 This case is located in `examples/3D_performance_test`.
-We report results for various numbers of grid points per CPU die (or GPU device) and hardware.
+
 Results are for MFC v4.9.3 (July 2024 release), though numbers have not changed meaningfully since then.
 All results are for the compiler that gave the best performance.
-CPU results may be performed on CPUs with more cores than reported in the table; we report results for the best performance given the full processor die by checking the performance for different core counts on that device.
-GPU results on single-precision (SP) GPUs performed computation in double-precision via conversion in compiler/software; these numbers are _not_ for single-precision computation.
+Note:
+* CPU results may be performed on CPUs with more cores than reported in the table; we report results for the best performance given the full processor die by checking the performance for different core counts on that device.
+These are reported as (X/Y cores), where X is the used cores, and Y is the total on the die.
+* GPU results on single-precision (SP) GPUs performed computation in double-precision via conversion in compiler/software; these numbers are _not_ for single-precision computation.
 AMD MI250X GPUs have two graphics compute dies (GCDs) per MI250X device; we report results for one GCD, though one can quickly estimate full MI250X runtime by halving the single GCD grind time number.
 
-
 | Hardware | | Grind Time | Compiler | Computer |
 | ---: | ----: | :----: | :--- | :--- |
 | NVIDIA GH200 (GPU only) | 1 GPU | 0.32 | NVHPC 24.1 | GT Rogues Gallery |
@@ -27,9 +28,11 @@ AMD MI250X GPUs have two graphics compute dies (GCDs) per MI250X device; we repo
 | AMD MI250X | 1 __GCD__ | 1.09 | CCE 16.0.1 | OLCF Frontier |
 | NVIDIA A40 (SP GPU) | 1 GPU | 3.3 | NVHPC 22.11 | NCSA Delta |
 | NVIDIA RTX6000 (SP GPU) | 1 GPU | 3.9 | NVHPC 22.11 | GT Phoenix |
-| Apple M1 Max | 8 cores | 72 | GNU 14.1.0 | N/A |
-| AMD EPYC 7713 | 32 cores | 137 | GNU 12.1.0 | GT Phoenix |
-| Intel Xeon Gold 6226 | 12 cores | 152 | Intel oneAPI 2022.1 | GT Phoenix |
+| Apple M1 Max | 8/10 cores | 72 | GNU 14.1.0 | N/A |
+| AMD EPYC 9534 (Genoa) | 64/64 cores | 96 | GNU 12.3.0 | GT Phoenix |
+| Intel Xeon Gold 6454S (Sapphire Rapids) | 16/32 cores | 111 | NVHPC 24.5 | GT Rogues Gallery |
+| AMD EPYC 7713 (Milan) | 32/64 cores | 137 | GNU 12.1.0 | GT Phoenix |
+| Intel Xeon Gold 6226 (Cascade Lake) | 12/12 cores | 152 | Intel oneAPI 2022.1 | GT Phoenix |
 
 __All grind times are in nanoseconds (ns) per grid point (gp) per equation (eq) per right-hand side (rhs) evaluation, so X ns/gp/eq/rhs. Lower is better.__

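
As an illustration of the unit used in the changed file (ns per grid point per equation per right-hand-side evaluation), the following Python sketch converts between wall-clock time and grind time. It is not part of this commit or of MFC. The function names are hypothetical, and the count of three right-hand-side evaluations per time step is an assumption made only for the example.

```python
def grind_time_ns(wall_seconds, grid_points, equations, rhs_evals):
    """Grind time in ns per grid point per equation per RHS evaluation."""
    return wall_seconds * 1e9 / (grid_points * equations * rhs_evals)


def estimated_wall_seconds(grind_ns, grid_points, equations, rhs_evals):
    """Invert the definition: predict wall-clock time from a grind time."""
    return grind_ns * 1e-9 * grid_points * equations * rhs_evals


def full_mi250x_grind_ns(single_gcd_grind_ns):
    """Estimate for a full MI250X (two GCDs) by halving the one-GCD figure."""
    return single_gcd_grind_ns / 2.0


if __name__ == "__main__":
    # Problem size quoted in the changed file: 8M grid points, 8 PDEs.
    grid_points = 8_000_000
    equations = 8
    # Assumption for illustration only: 1000 steps, 3 RHS evaluations each.
    rhs_evals = 3 * 1000

    # 0.32 is the GH200 grind time from the table.
    seconds = estimated_wall_seconds(0.32, grid_points, equations, rhs_evals)
    print(f"Estimated wall time: {seconds:.1f} s")

    # 1.09 is the single-GCD MI250X grind time from the table.
    print(f"Full MI250X estimate: {full_mi250x_grind_ns(1.09):.3f} ns")
```

Under these assumptions the GH200 estimate comes to roughly 61 seconds. Actual run time also depends on I/O, start-up cost, and the time stepper in use, none of which the grind time captures.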

docs/documentation/readme.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 - [Example Cases](examples.md)
 - [Running MFC](running.md)
 - [Flow Visualization](visualization.md)
-- [Performance Results](expectedPerformance.md)
+- [Performance](expectedPerformance.md)
 - [MFC's Authors](authors.md)
 - [References](references.md)
