Commit 8111b51

Minor cosmetics in Readme
1 parent ee3e739 commit 8111b51

1 file changed: +21 -21 lines changed

README.md

Lines changed: 21 additions & 21 deletions
@@ -56,7 +56,7 @@ Works with any GPU in Windows, Linux, macOS and Android.
 | Device Name | NVIDIA H100 80GB HBM3 |
 | Device Vendor | NVIDIA Corporation |
 | Device Driver | 565.57.01 (Linux) |
-| OpenCL Version | OpenCL C 1.2 |
+| OpenCL Version | OpenCL C 3.0 |
 | Compute Units | 132 at 1980 MHz (16896 cores, 66.908 TFLOPs/s) |
 | Memory, Cache | 81105 MB VRAM, 4224 KB global / 48 KB local |
 | Buffer Limits | 20276 MB global, 64 KB constant |
@@ -80,30 +80,30 @@ Works with any GPU in Windows, Linux, macOS and Android.
 ```
 ```
 |----------------.------------------------------------------------------------|
-| Device ID | 2 |
-| Device Name | AMD Instinct MI210 |
+| Device ID | 0 |
+| Device Name | AMD Instinct MI300X |
 | Device Vendor | Advanced Micro Devices, Inc. |
-| Device Driver | 3625.0 (HSA1.1,LC) (Linux) |
+| Device Driver | 3635.0 (HSA1.1,LC) (Linux) |
 | OpenCL Version | OpenCL C 2.0 |
-| Compute Units | 104 at 1700 MHz (6656 cores, 22.630 TFLOPs/s) |
-| Memory, Cache | 65520 MB VRAM, 16 KB global / 64 KB local |
-| Buffer Limits | 65520 MB global, 67092480 KB constant |
+| Compute Units | 304 at 2100 MHz (19456 cores, 81.715 TFLOPs/s) |
+| Memory, Cache | 196592 MB VRAM, 32 KB global / 64 KB local |
+| Buffer Limits | 196592 MB global, 201310208 KB constant |
 |----------------'------------------------------------------------------------|
 | Info: OpenCL C code successfully compiled. |
-| FP64 compute 17.681 TFLOPs/s (2/3 ) |
-| FP32 compute 20.007 TFLOPs/s ( 1x ) |
-| FP16 compute 39.594 TFLOPs/s ( 2x ) |
-| INT64 compute 1.515 TIOPs/s (1/16) |
-| INT32 compute 9.877 TIOPs/s (1/2 ) |
-| INT16 compute 19.532 TIOPs/s ( 1x ) |
-| INT8 compute 36.307 TIOPs/s ( 2x ) |
-| Memory Bandwidth ( coalesced read ) 993.82 GB/s |
-| Memory Bandwidth ( coalesced write) 999.76 GB/s |
-| Memory Bandwidth (misaligned read ) 1325.91 GB/s |
-| Memory Bandwidth (misaligned write) 635.20 GB/s |
-| PCIe Bandwidth (send ) 28.72 GB/s |
-| PCIe Bandwidth ( receive ) 28.51 GB/s |
-| PCIe Bandwidth ( bidirectional) (Gen4 x16) 28.61 GB/s |
+| FP64 compute 54.944 TFLOPs/s (2/3 ) |
+| FP32 compute 130.000 TFLOPs/s ( 2x ) |
+| FP16 compute 141.320 TFLOPs/s ( 2x ) |
+| INT64 compute 3.666 TIOPs/s (1/24) |
+| INT32 compute 47.736 TIOPs/s (2/3 ) |
+| INT16 compute 69.022 TIOPs/s ( 1x ) |
+| INT8 compute 106.178 TIOPs/s ( 1x ) |
+| Memory Bandwidth ( coalesced read ) 3756.64 GB/s |
+| Memory Bandwidth ( coalesced write) 4686.31 GB/s |
+| Memory Bandwidth (misaligned read ) 3881.24 GB/s |
+| Memory Bandwidth (misaligned write) 2491.25 GB/s |
+| PCIe Bandwidth (send ) 54.57 GB/s |
+| PCIe Bandwidth ( receive ) 55.79 GB/s |
+| PCIe Bandwidth ( bidirectional) (Gen4 x16) 55.21 GB/s |
 |-----------------------------------------------------------------------------|
 ```
 ```
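
A quick sanity check on the `Compute Units` rows in the diff above: the TFLOPs/s figure in each row matches cores × 2 × clock, i.e. one fused multiply-add (2 FLOPs) per core per cycle. That factor of 2 is inferred from the numbers shown, not stated in the commit, and the helper name `peak_tflops` is hypothetical. A minimal sketch:

```python
def peak_tflops(cores: int, clock_mhz: float, flops_per_core_per_cycle: int = 2) -> float:
    """Theoretical peak in TFLOPs/s.

    Assumes each core retires `flops_per_core_per_cycle` FLOPs per clock
    (2 = one fused multiply-add); this factor is an assumption inferred
    from the reported values.
    """
    return cores * flops_per_core_per_cycle * clock_mhz * 1e6 / 1e12


# (cores, clock in MHz, TFLOPs/s as reported in the diff above)
devices = {
    "NVIDIA H100 80GB HBM3": (16896, 1980, 66.908),
    "AMD Instinct MI210": (6656, 1700, 22.630),
    "AMD Instinct MI300X": (19456, 2100, 81.715),
}

for name, (cores, mhz, reported) in devices.items():
    print(f"{name}: computed {peak_tflops(cores, mhz):.3f}, reported {reported:.3f}")
```

All three reported values agree with this formula to within rounding, so the updated MI300X row is internally consistent with the other device tables.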
