Commit 0ccf0ba

committed: add not about --tensor-split
1 parent 6f90443 commit 0ccf0ba

File tree

1 file changed (+2, -1)


tools/rpc/README.md

Lines changed: 2 additions & 1 deletion
````diff
@@ -80,7 +80,8 @@ Finally, when running `llama-cli` or `llama-server`, use the `--rpc` option to s
 $ llama-cli -hf ggml-org/gemma-3-1b-it-GGUF -ngl 99 --rpc 192.168.88.10:50052,192.168.88.11:50052
 ```
 
-This way you can offload model layers to both local and remote devices.
+By default, the ggml scheduler distributes model weights across all available devices -- both local and remote -- in proportion to each device's available memory.
+You can override this behavior with the `--tensor-split` option and set custom proportions when splitting tensor data across devices.
 
 ### Local cache
 
````
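The note added by this commit describes two splitting modes: the default, proportional to each device's free memory, and a user-supplied ratio via `--tensor-split`. As a rough illustration only — a hypothetical sketch, not ggml's actual scheduler code — here is how a whole-layer split under either kind of proportion might be computed:

```python
# Hypothetical sketch (NOT ggml's real implementation) of apportioning
# model layers across devices according to a list of proportions.

def split_layers(n_layers, proportions):
    """Assign n_layers across devices in the given proportions,
    giving the remainder to the last device so every layer is
    placed exactly once."""
    total = sum(proportions)
    counts = []
    assigned = 0
    for i, p in enumerate(proportions):
        if i == len(proportions) - 1:
            counts.append(n_layers - assigned)  # remainder to last device
        else:
            c = round(n_layers * p / total)
            counts.append(c)
            assigned += c
    return counts

# Default mode: proportions are each device's available memory
# (illustrative GiB figures for a local GPU and a remote RPC device).
free_mem = [24, 8]
print(split_layers(32, free_mem))   # -> [24, 8]

# Override mode: --tensor-split 1,1 would force an even split
# regardless of available memory.
print(split_layers(32, [1, 1]))     # -> [16, 16]
```

The real `--tensor-split` flag takes a comma-separated list of proportions, e.g. `--tensor-split 3,1` to place roughly three quarters of the weights on the first device.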
