| Performance | 990 TFLOPS (FP16 Tensor Core) |
| Specifications | - GH200 SuperChip with 72 ARM Neoverse V2 cores<br />- 480 GB of LPDDR5X DRAM<br />- 96 GB of HBM3 GPU memory<br />(CPU and GPU memory are fully merged, for up to 576 GB of globally usable memory) |
| Inter-GPU bandwidth (for clusters of up to 256 GH200) | NVLink Switch System, 900 GB/s |
| Format & Features | Single chip up to GH200 clusters (for larger setups, [contact us](https://www.scaleway.com/en/contact-ai-supercomputers/)) |
| Use cases | - Extra-large LLM and DL model inference<br />- HPC |
| What they are not made for | - Graphics rendering<br />- Training |
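The merged CPU+GPU address space described in the table above can be exercised directly from CUDA. Below is a minimal sketch, not a benchmarked recipe: it assumes a GH200 node with a recent CUDA toolkit, and the 128 GiB allocation size is a hypothetical figure chosen only to exceed the 96 GB of HBM3.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Report the HBM3 capacity visible to the GPU (expected ~96 GB on GH200).
    size_t free_b = 0, total_b = 0;
    cudaMemGetInfo(&free_b, &total_b);
    printf("GPU memory: %.1f GiB total, %.1f GiB free\n",
           total_b / (double)(1ULL << 30), free_b / (double)(1ULL << 30));

    // On GH200, the LPDDR5X host memory and HBM3 form one coherent address
    // space, so a managed allocation larger than HBM can still be served.
    // Hypothetical size for illustration: 128 GiB (> 96 GB of HBM3).
    void *buf = nullptr;
    cudaError_t err = cudaMallocManaged(&buf, 128ULL << 30);
    printf("cudaMallocManaged(128 GiB): %s\n", cudaGetErrorString(err));
    if (err == cudaSuccess) cudaFree(buf);
    return 0;
}
```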
pages/gpu/reference-content/understanding-nvidia-nvlink.mdx (+3 -3)
@@ -7,7 +7,7 @@ dates:
posted: 2025-03-13
---
-NVLink is NVIDIA's high-bandwidth, low-latency GPU-to-GPU interconnect with built-in resiliency features, available on Scaleway's [H100-SGX Instances](/gpu/reference-content/choosing-gpu-instance-type/#gpu-instances-and-ai-supercomputer-comparison-table). It was designed to significantly improve the performance and efficiency when connecting GPUs, CPUs, and other components within the same node.
+NVLink is NVIDIA's high-bandwidth, low-latency GPU-to-GPU interconnect with built-in resiliency features, available on Scaleway's [H100-SXM Instances](/gpu/reference-content/choosing-gpu-instance-type/#gpu-instances-and-ai-supercomputer-comparison-table). It was designed to significantly improve the performance and efficiency when connecting GPUs, CPUs, and other components within the same node.
It provides much higher bandwidth (up to 900 GB/s total GPU-to-GPU bandwidth in an 8-GPU configuration) and lower latency compared to traditional PCIe Gen 4 (up to 32 GB/s per link).
This allows more data to be transferred between GPUs in less time while also reducing latency.
@@ -21,7 +21,7 @@ Unified Memory Access allows GPUs to access each other's memory directly without
### Comparison: NVLink vs. PCIe
NVLink and PCI Express (PCIe) are both used for GPU communication, but NVLink is specifically designed to address the bandwidth and latency bottlenecks of PCIe in multi-GPU setups.
|**Feature**| NVLink | PCIe |
|---|---|---|
|**Bandwidth**| Up to 900 GB/s (aggregate, multi-GPU) | 128 GB/s (x16 bidirectional) |
@@ -31,4 +31,4 @@ NVLink and PCI Express (PCIe) are both used for GPU communication, but NVLink is
|**Scalability**| Multi-GPU direct connection via NVSwitch | Limited by PCIe lanes |
|**Efficiency**| Optimized for GPU workloads | More general-purpose |
-In summary, NVLink, available on [H100-SGX Instances](/gpu/reference-content/choosing-gpu-instance-type/#gpu-instances-and-ai-supercomputer-comparison-table), is **superior** for **multi-GPU AI and HPC** workloads due to its **higher bandwidth, lower latency, and memory-sharing capabilities**, while PCIe remains essential for broader system connectivity and general computing.
+In summary, NVLink, available on [H100-SXM Instances](/gpu/reference-content/choosing-gpu-instance-type/#gpu-instances-and-ai-supercomputer-comparison-table), is **superior** for **multi-GPU AI and HPC** workloads due to its **higher bandwidth, lower latency, and memory-sharing capabilities**, while PCIe remains essential for broader system connectivity and general computing.