### TransferEngine backend
To achieve non-disturbing weight transfer, we introduce an alternative backend: <a href="https://github.com/sgl-project/sglang/pull/14997">TransferEngine</a>, which leverages GPU-Direct RDMA for efficient data movement[2]. TransferEngine (TE) is a lightweight RDMA-based transfer runtime that runs alongside each TPWorker on the source instance and exposes GPU-resident weight tensors to remote readers without invoking CUDA kernels on the source.
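To make the read-side behavior concrete, here is a minimal sketch of the idea: the source registers its weight buffers once, and remote readers pull byte ranges without triggering any compute on the source. All names here (`TransferEngine`, `register_weights`, `serve_read`) are hypothetical stand-ins for illustration, not the actual SGLang API, and plain byte buffers stand in for GPU-resident tensors that real RDMA would access in NIC hardware.

```python
# Illustrative sketch only: class and method names are hypothetical,
# not SGLang's real TransferEngine API. Byte buffers stand in for
# GPU-resident weight tensors accessed via GPU-Direct RDMA.

class TransferEngine:
    """Runs alongside a TPWorker on the source; exposes registered
    weight buffers to remote readers with no compute on the source."""

    def __init__(self):
        self._registry = {}  # tensor name -> backing buffer

    def register_weights(self, name, buf):
        # Real RDMA would pin the memory and return a remote key (rkey);
        # here we just record the buffer and hand back a token.
        self._registry[name] = buf
        return f"rkey:{name}"

    def serve_read(self, name, offset, length):
        # One-sided read: the source only returns the requested byte
        # range (real RDMA performs this in the NIC, not in a kernel).
        return bytes(self._registry[name][offset:offset + length])


# Source TPWorker registers a (mock) weight tensor once at startup.
engine = TransferEngine()
weights = bytearray(b"\x01\x02\x03\x04\x05\x06\x07\x08")
token = engine.register_weights("layer0.qkv", weights)

# A destination reader pulls a slice without any source-side kernel.
chunk = engine.serve_read("layer0.qkv", offset=2, length=4)
print(chunk)  # b'\x03\x04\x05\x06'
```

The point of the sketch is the asymmetry: registration happens once, and every subsequent read is driven entirely by the destination.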
During source SGLang instance initialization:
1. Each TPWorker (tensor parallel worker) spawns a TransferEngine instance.