Add current generation CPUs and GPUs. (#764)

garloff · web-flow · commit d0dce6de3b2f · 2024-10-04T08:47:14.000+02:00
* Add current generation CPUs and GPUs.
  - On the CPU side, add AMD Zen-5/5c, intel Sierra Forest (Crestmont) and Granite Rapids (Redwood Cove), ARM AmpereOne and Cortex-A72x/Neoverse N3/V3.
  - On the GPU side, add nVidia Hopper (h), C/RDNA3.5 (3.5) and C/RDNA4 (4), intel Xe12.2 (2) and Arc/12.7/DG2 (3).
  - Both the spec and the code are updated. The spec does not get a new revision number, as this is a fully backwards-compatible enhancement (and a logical continuation). We could of course bump the minor version (and call this v3.2).
* 'h' for Hopper can be confused with the 'h' High-Perf suffix.
  - Our parser does exactly that, so use 'g' (Grace Hopper) for this generation instead.
* Add cosmetic improvement to CU number.
  - We call these "SMs/CUs/EUs" for nVidia/AMD/intel, in the order of market penetration for these GPUs and the order they are listed in the standard. Improves readability a tiny bit ...

Signed-off-by: Kurt Garloff <kurt@garloff.de>
diff --git a/Standards/scs-0100-v3-flavor-naming.md b/Standards/scs-0100-v3-flavor-naming.md
@@ -366,13 +366,15 @@ The options for arch are as follows:
 The generation is vendor specific and can be left out, but it can only be specified in
 conjunction with a vendor. At present, these values are possible:
 
-| Generation | i (Intel x86-64) | z (AMD x86-64) |  a (AArch64)       | r (RISC-V) |
-| ---------- | ---------------- | -------------- | ------------------ | ---------- |
-| 0          | pre Skylake      | pre Zen        | pre Cortex A76     | TBD        |
-| 1          | Skylake          | Zen-1 (Naples) | A76/NeoN1 class    | TBD        |
-| 2          | Cascade Lake     | Zen-2 (Rome)   | A78/x1/NeoV1 class | TBD        |
-| 3          | Ice Lake         | Zen-3 (Milan)  | A71x/NeoN2 (ARMv9) | TBD        |
-| 4          | Sapphire Rapids  | Zen-4 (Genoa)  |                    | TBD        |
+| Generation | i (Intel x86-64)  | z (AMD x86-64) |  a (AArch64)         | r (RISC-V) |
+| ---------- | ----------------- | -------------- | -------------------- | ---------- |
+| 0          | pre Skylake       | pre Zen        | pre Cortex A76       | TBD        |
+| 1          | Skylake           | Zen-1 (Naples) | A76/NeoN1 class      | TBD        |
+| 2          | Cascade Lake      | Zen-2 (Rome)   | A78/x1/NeoV1 class   | TBD        |
+| 3          | Ice Lake          | Zen-3 (Milan)  | A71x/NeoN2/V2(ARMv9) | TBD        |
+| 4          | Sapphire Rapids   | Zen-4 (Genoa)  | AmpereOne (ARMv8.6)  | TBD        |
+| 5          | Sierra Forest(E)  | Zen-5 (Turin)  | A72x/NeoN3/V3(Av9.2) | TBD        |
+| 6          | Granite Rapids(P) |                |                      | TBD        |
 
 It is recommended to leave out the `0` when specifying the old generation; this will
 help the parser tool, which assumes 0 for an unspecified value and does leave it
@@ -384,8 +386,11 @@ out when generating the name for comparison. In other words: 0 has a meaning of
 We don't differentiate between Zen-4 (Genoa) and Zen-4c (Bergamo); L3 cache per
 Siena core is smaller on Bergamo and the frequency lower but the cores are otherwise
 identical. As we already have a qualifier `h` that allows to specify higher frequencies
-(which Genoa thus may use more and Bergamo less or not), we have enough distinction
-capabilities.
+(which Genoa thus may use more and Bergamo not), we have enough distinction
+capabilities. The same applies to Zen-5 (Turin) and Zen-5c (Turin Dense).
+For intel with the server E-cores (Crestmont), these received their own
+generation assignment, as the difference to the server P-cores (Redwood Cove)
+is more significant.
 
 :::
 
@@ -430,9 +435,9 @@ Note that the vendor letter X is mandatory, generation and processing units are
 | `A`      | AMD    | compute units (CUs)             |
 | `I`      | Intel  | execution units (EUs)           |
 
-For nVidia, the generation N can be f=Fermi, k=Kepler, m=Maxwell, p=Pascal, v=Volta, t=turing, a=Ampere, l=Ada Lovelace, ...,
-for AMD GCN-x=0.x, RDNA1=1, RDNA2=2, RDNA3=3,
-for Intel Gen9=0.9, Xe(12.1)=1, ...
+For nVidia, the generation N can be f=Fermi, k=Kepler, m=Maxwell, p=Pascal, v=Volta, t=turing, a=Ampere, l=Ada Lovelace, g=Grace Hopper, ...,
+for AMD GCN-x=0.x, RDNA1=1, C/RDNA2=2, C/RDNA3=3, C/RDNA3.5=3.5, C/RDNA4=4, ...
+for Intel Gen9=0.9, Xe(12.1/DG1)=1, Xe(12.2)=2, Arc(12.7/DG2)=3 ...
 (Note: This may need further work to properly reflect what's out there.)
 
 The optional `h` suffix to the compute unit count indicates high-performance (e.g. high freq or special
diff --git a/Tests/iaas/flavor-naming/flavor_names.py b/Tests/iaas/flavor-naming/flavor_names.py
@@ -192,9 +192,11 @@ class CPUBrand:
     component_name = "cpubrand"
     cpuvendor = TblAttr("CPU Vendor", {"i": "Intel", "z": "AMD", "a": "ARM", "r": "RISC-V"})
     cpugen = DepTblAttr("#.CPU Gen", cpuvendor, {
-        "i": {None: '(unspecified)', 0: "Unspec/Pre-Skylake", 1: "Skylake", 2: "Cascade Lake", 3: "Ice Lake", 4: "Sapphire Rapids"},
-        "z": {None: '(unspecified)', 0: "Unspec/Pre-Zen", 1: "Zen 1", 2: "Zen 2", 3: "Zen 3", 4: "Zen 4"},
-        "a": {None: '(unspecified)', 0: "Unspec/Pre-A76", 1: "A76/NeoN1", 2: "A78/X1/NeoV1", 3: "A710/NeoN2"},
+        "i": {None: '(unspecified)', 0: "Unspec/Pre-Skylake", 1: "Skylake", 2: "Cascade Lake", 3: "Ice Lake", 4: "Sapphire Rapids",
+              5: 'Sierra Forest (E)', 6: 'Granite Rapids (P)'},
+        "z": {None: '(unspecified)', 0: "Unspec/Pre-Zen", 1: "Zen 1", 2: "Zen 2", 3: "Zen 3", 4: "Zen 4/4c", 5: "Zen 5/5c"},
+        "a": {None: '(unspecified)', 0: "Unspec/Pre-A76", 1: "A76/NeoN1", 2: "A78/X1/NeoV1", 3: "A71x/NeoN2/V2",
+              4: "AmpereOne", 5: "A72x/NeoN3/V3"},
         "r": {None: '(unspecified)', 0: "Unspec"},
     })
     perf = TblAttr("Performance", {"": "Std Perf", "h": "High Perf", "hh": "Very High Perf", "hhh": "Very Very High Perf"})
@@ -213,11 +215,13 @@ class GPU:
     brand = TblAttr("Brand", {"N": "nVidia", "A": "AMD", "I": "Intel"})
     gen = DepTblAttr("Gen", brand, {
         "N": {'': '(unspecified)', "f": "Fermi", "k": "Kepler", "m": "Maxwell", "p": "Pascal",
-              "v": "Volta", "t": "Turing", "a": "Ampere", "l": "AdaLovelace"},
-        "A": {'': '(unspecified)', "0.4": "GCN4.0/Polaris", "0.5": "GCN5.0/Vega", "1": "RDNA1/Navi1x", "2": "RDNA2/Navi2x", "3": "RDNA3/Navi3x"},
-        "I": {'': '(unspecified)', "0.9": "Gen9/Skylake", "0.95": "Gen9.5/KabyLake", "1": "Xe1/Gen12.1", "2": "Xe2"},
+              "v": "Volta", "t": "Turing", "a": "Ampere", "l": "AdaLovelace", "g": "GraceHopper"},
+        "A": {'': '(unspecified)', "0.4": "GCN4.0/Polaris", "0.5": "GCN5.0/Vega", "1": "RDNA1/Navi1x", "2": "C/RDNA2/Navi2x",
+              "3": "C/RDNA3/Navi3x", "3.5": "C/RDNA3.5", "4": "C/RDNA4"},
+        "I": {'': '(unspecified)', "0.9": "Gen9/Skylake", "0.95": "Gen9.5/KabyLake", "1": "Xe1/Gen12.1/DG1", "2": "Xe2/Gen12.2",
+              "3": "Arc/Gen12.7/DG2"},
     })
-    cu = OptIntAttr("#.CU/EU/SM")
+    cu = OptIntAttr("#.N:SMs/A:CUs/I:EUs")
     perf = TblAttr("Performance", {"": "Std Perf", "h": "High Perf", "hh": "Very High Perf", "hhh": "Very Very High Perf"})
 
 
@@ -696,7 +700,7 @@ def prettyname(flavorname, prefix=""):
         stg += _tbl_out(flavorname.gpu, "perf", True)
         stg += _tbl_out(flavorname.gpu, "gen", True)
         if flavorname.gpu.cu is not None:
-            stg += f"(w/ {flavorname.gpu.cu} CU/EU/SM) "
+            stg += f"(w/ {flavorname.gpu.cu} SMs/CUs/EUs) "
     # IB
     if flavorname.ib:
         stg += "and Infiniband "
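
The commit message notes that `h` for Hopper collides with the high-performance suffix. The ambiguity can be illustrated with a minimal sketch. This is not the parser from `flavor_names.py`: the helper `readings` and the set `NVIDIA_GENS_OLD` are hypothetical names introduced here, the unit count is ignored, and the sketch assumes the `h` suffix may appear when the unit count is omitted.

```python
# Hypothetical illustration only, not the actual flavor_names.py parser.
# Generation letters for nVidia before this change (taken from the diff above).
NVIDIA_GENS_OLD = {"f", "k", "m", "p", "v", "t", "a", "l"}

# Performance suffixes as listed in the "Performance" table attribute.
PERF_SUFFIXES = ("", "h", "hh", "hhh")


def readings(tail, gens):
    """Return every (generation, perf) split of the text after the vendor letter.

    Assumption: the optional "-<units>" part is left out for simplicity.
    """
    found = []
    for cut in range(len(tail) + 1):
        gen, perf = tail[:cut], tail[cut:]
        if (gen == "" or gen in gens) and perf in PERF_SUFFIXES:
            found.append((gen, perf))
    return found


# With 'h' as a generation letter, a bare "h" has two valid readings:
# "no generation, high perf" and "Hopper, standard perf".
with_h = readings("h", NVIDIA_GENS_OLD | {"h"})

# With 'g' instead, a bare "h" can only mean high perf.
with_g = readings("h", NVIDIA_GENS_OLD | {"g"})

print(len(with_h), "readings with 'h' as generation letter")
print(len(with_g), "reading with 'g' as generation letter")
```

Under these assumptions, choosing `g` keeps every suffix uniquely decodable; for instance `readings("gh", NVIDIA_GENS_OLD | {"g"})` yields the single split `("g", "h")`.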