1 parent 27ca23d commit 49d849b
docs/source/index.rst
@@ -70,6 +70,7 @@ Documentation
    serving/distributed_serving
    serving/run_on_sky
+   serving/deploying_with_kserve
    serving/deploying_with_triton
    serving/deploying_with_docker
    serving/serving_with_langchain
docs/source/serving/deploying_with_kserve.rst
@@ -0,0 +1,8 @@
+.. _deploying_with_kserve:
+
+Deploying with KServe
+============================
+
+vLLM can be deployed with `KServe <https://github.com/kserve/kserve>`_ on Kubernetes for highly scalable distributed model serving.
+
+Please see `this guide <https://kserve.github.io/website/latest/modelserving/v1beta1/llm/vllm/>`_ for more details on using vLLM with KServe.
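As a rough sketch of the kind of deployment the linked guide covers, a KServe ``InferenceService`` can run vLLM's OpenAI-compatible server in a custom predictor container. The service name, image tag, model, and resource values below are illustrative assumptions, not part of this commit:

```yaml
# Sketch of a KServe InferenceService running vLLM (values are illustrative).
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: vllm-example          # hypothetical service name
spec:
  predictor:
    containers:
      - name: kserve-container
        image: vllm/vllm-openai:latest   # vLLM's OpenAI-compatible server image
        args:
          - "--model"
          - "facebook/opt-125m"          # example model; substitute your own
        resources:
          limits:
            nvidia.com/gpu: "1"          # one GPU per replica
```

Applying a manifest like this (e.g. with ``kubectl apply -f``) lets KServe handle scaling and routing while vLLM serves the model inside each replica; consult the KServe guide linked above for the supported configuration.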