pages/gpu/reference-content/migration-h100.mdx (6 additions & 8 deletions)
@@ -16,7 +16,7 @@ For optimal availability and performance, we recommend switching from **H100-2-8
 There are two primary scenarios: migrating **Kubernetes (Kapsule)** workloads or **standalone** workloads.

 <Message type="important">
-Always ensure that your **data is backed up** before performing any operations that could affect it. Keep in mind that **scratch storage** is ephemeral and does not survive once the Instance is stopped: doing a full stop/start cycle will **erase the scratch data**. However, doing a simple reboot or using the **stop in place** function will keep the data.
+Always make sure your **data is backed up** before performing any operation that could affect it. Remember that **scratch storage** is ephemeral and will not persist after an Instance is fully stopped. A full stop/start cycle, such as during an Instance server migration, will **erase all scratch data**. However, outside of server-type migrations, a simple reboot or using **stop in place** will preserve the data stored on the Instance's scratch storage.
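The stop-related behavior described in the note above corresponds to distinct server actions in the Scaleway Instance API. The sketch below only builds the request for a "stop in place"; it does not send anything. The endpoint shape and the `stop_in_place` action name are assumptions based on the public Instance API, and the server ID is hypothetical.

```python
# Sketch: constructing a "stop in place" request for the Scaleway Instance API.
# The endpoint shape and action names are assumptions drawn from the public
# Instance API reference, not from this diff. Nothing is sent over the network.
import json

API_BASE = "https://api.scaleway.com/instance/v1"

def server_action_request(zone: str, server_id: str, action: str) -> tuple[str, str]:
    """Return (url, json_body) for a server action.

    action: "poweroff"       -> full stop, erases scratch data
            "reboot"         -> keeps scratch data
            "stop_in_place"  -> keeps scratch data
    """
    url = f"{API_BASE}/zones/{zone}/servers/{server_id}/action"
    body = json.dumps({"action": action})
    return url, body

# Hypothetical zone and server ID, for illustration only.
url, body = server_action_request(
    "fr-par-2", "11111111-1111-1111-1111-111111111111", "stop_in_place"
)
print(url)
print(body)
```

Choosing `stop_in_place` rather than `poweroff` is what keeps the scratch volume intact, per the note above.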
@@ -86,11 +86,9 @@ H100 PCIe-based GPU Instances are not End-of-Life (EOL), but due to limited avai
 #### Is H100-SXM-2-80G compatible with my current setup?
 Yes — it runs the same CUDA toolchain and supports standard frameworks (PyTorch, TensorFlow, etc.). No changes in your code base are required when upgrading to a SXM-based GPU Instance.

-#### Why is H100-SXM better for multi-GPU?
-The NVIDIA H100-SXM outperforms the H100-PCIe in multi-GPU configurations due to its superior interconnect and higher power capacity.
-It leverages fourth-generation NVLink and NVSwitch, providing up to 900 GB/s of bidirectional bandwidth for rapid GPU-to-GPU communication, compared to the H100-PCIe's 128 GB/s via PCIe Gen 5, which creates bottlenecks in demanding workloads like large-scale AI training and HPC.
-Additionally, the H100-SXM's 700W TDP enables higher clock speeds and sustained performance, while the H100-PCIe's 300-350W TDP limits its throughput.
-For high-communication, multi-GPU tasks, the H100-SXM is the optimal choice, while the H100-PCIe suits less intensive applications with greater flexibility.
+#### Why is the H100-SXM better for multi-GPU workloads?

-#### What if my workload needs more CPU or RAM?
-Let us know via [support ticket](https://console.scaleway.com/support/tickets/create) what your specific requirements are. Currently we are evaluating options for compute-optimized configurations to complement our GPU offerings.
+The NVIDIA H100-SXM outperforms the H100-PCIe in multi-GPU configurations primarily due to its higher interconnect bandwidth and greater power capacity. It uses fourth-generation NVLink and NVSwitch, delivering up to **900 GB/s of bidirectional bandwidth** for fast GPU-to-GPU communication. In contrast, the H100-PCIe is limited to a **theoretical maximum of 128 GB/s** via PCIe Gen 5, which becomes a bottleneck in communication-heavy workloads such as large-scale AI training and HPC.
+The H100-SXM also provides **HBM3 memory** with up to **3.35 TB/s of bandwidth**, compared to **2 TB/s** with the H100-PCIe's HBM2e, improving performance in memory-bound tasks.
+Additionally, the H100-SXM's **700W TDP** allows higher sustained clock speeds and throughput, while the H100-PCIe's **300–350W TDP** imposes stricter performance limits.
+Overall, the H100-SXM is the optimal choice for high-communication, multi-GPU workloads, whereas the H100-PCIe offers more flexibility for less communication-intensive applications.
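The bandwidth figures quoted in the added lines support a quick back-of-the-envelope comparison. The sketch below estimates the ideal time to move a hypothetical 10 GB gradient payload over each interconnect at its peak bidirectional bandwidth; this is a deliberate simplification, since real collective performance also depends on topology, message size, and the communication library.

```python
# Back-of-the-envelope: ideal time to move a gradient payload over each link,
# using the peak bandwidth figures quoted above (900 GB/s NVLink/NVSwitch vs.
# 128 GB/s PCIe Gen 5). Real multi-GPU performance also depends on topology,
# collective algorithm, and framework overhead.
def transfer_time_s(payload_gb: float, bandwidth_gb_s: float) -> float:
    """Ideal transfer time in seconds at a given link bandwidth."""
    return payload_gb / bandwidth_gb_s

PAYLOAD_GB = 10.0       # hypothetical gradient payload, for illustration
NVLINK_GB_S = 900.0     # H100-SXM: NVLink/NVSwitch bidirectional peak
PCIE_GEN5_GB_S = 128.0  # H100-PCIe: theoretical PCIe Gen 5 maximum

sxm_s = transfer_time_s(PAYLOAD_GB, NVLINK_GB_S)
pcie_s = transfer_time_s(PAYLOAD_GB, PCIE_GEN5_GB_S)
print(f"H100-SXM : {sxm_s * 1000:.1f} ms")
print(f"H100-PCIe: {pcie_s * 1000:.1f} ms")
print(f"Speedup  : {pcie_s / sxm_s:.1f}x")  # ratio of the two bandwidths, ~7x
```

The roughly 7x gap comes purely from the bandwidth ratio (900 / 128); it illustrates why communication-heavy training is where the SXM form factor pays off.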