diff --git a/README.md b/README.md
index 1764e197..f1a2bd16 100644
--- a/README.md
+++ b/README.md
@@ -21,7 +21,7 @@ workloads currently.
 [Kubernetes]:https://kubernetes.io
 [Architecture Documentation]:docs/architecture.md
 [Gateway API Inference Extension (GIE)]:https://github.com/kubernetes-sigs/gateway-api-inference-extension
-[P/D Disaggregation]:docs/dp.md
+[P/D Disaggregation]:docs/pd_disagg.md
 [Gateway API]:https://github.com/kubernetes-sigs/gateway-api
 [Envoy]:https://github.com/envoyproxy/envoy
 [ext-proc]:https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter
diff --git a/docs/architecture.md b/docs/architecture.md
index 2b981151..8b54fd5c 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -302,7 +302,7 @@ The **vLLM sidecar** handles orchestration between Prefill and Decode stages. It
 - Local memory management
 - Experimental protocol compatibility
 
-> **Note**: The detailed P/D design is available in this document: [Disaggregated Prefill/Decode in llm-d](./dp.md)
+> **Note**: The detailed P/D design is available in this document: [Disaggregated Prefill/Decode in llm-d](./pd_disagg.md)
 
 ---
 ## InferencePool & InferenceModel Design
diff --git a/docs/dp.md b/docs/pd_disagg.md
similarity index 100%
rename from docs/dp.md
rename to docs/pd_disagg.md