vllm-project · ApostaC · Feb 7, 2025 · Feb 6, 2025 · Feb 7, 2025 · Feb 7, 2025
diff --git a/README.md b/README.md
@@ -12,6 +12,15 @@
 - 💻 Monitor the  through a web dashboard
 - 😄 Enjoy the performance benefits brought by request routing and KV cache offloading
 
+## Step-By-Step Tutorials
+
+0. How to [*Install Kubernetes (kubectl, helm, minikube, etc.)*](https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md)?
+1. How to [*Set Up a Minimal vLLM Production Stack*](https://github.com/vllm-project/production-stack/blob/main/tutorials/01-minimal-helm-installation.md)?
+2. How to [*Customize vLLM Configs (optional)*](https://github.com/vllm-project/production-stack/blob/main/tutorials/02-basic-vllm-config.md)?
+3. How to [*Load Your LLM Weights*](https://github.com/vllm-project/production-stack/blob/main/tutorials/03-load-model-from-pv.md)?
+4. How to [*Launch Different LLMs in vLLM Production Stack*](https://github.com/vllm-project/production-stack/blob/main/tutorials/04-launch-multiple-model.md)?
+5. How to [*Enable KV Cache Offloading with LMCache*](https://github.com/vllm-project/production-stack/blob/main/tutorials/05-offload-kv-cache.md)?
+
 ## Architecture
 
 The stack is set up using [Helm](https://helm.sh/docs/), and contains the following key parts:

diff --git a/tutorials/01-minimal-helm-installation.md b/tutorials/01-minimal-helm-installation.md
@@ -48,6 +48,8 @@ servingEngineSpec:
     requestMemory: "16Gi"
     requestGPU: 1
 
+# set replicaCount to 2 or more to set up multiple vLLM instances
+
     pvcStorage: "10Gi"
 ```
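
The `replicaCount` comment added in this hunk can be illustrated with a small values fragment. This is a minimal sketch that reuses only keys visible in the diff. It assumes `replicaCount` sits at the same level as `requestMemory` and `requestGPU`. The hunk does not show the parent keys under `servingEngineSpec`, so they are omitted rather than guessed.

```yaml
# Fragment only: parent keys under servingEngineSpec are not shown in this hunk.
# Assumption: replicaCount lives alongside the resource requests below.
replicaCount: 2          # 2 or more sets up multiple vLLM instances
requestMemory: "16Gi"
requestGPU: 1
pvcStorage: "10Gi"
```

If the resource requests apply per instance, as is usual for Kubernetes pod requests, two replicas would likely need two GPUs and 32Gi of memory in total, so check cluster capacity before raising the count.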