docs/README.md
11 lines changed: 11 additions & 0 deletions
@@ -21,6 +21,17 @@ vLLM is a fast and easy-to-use library for LLM inference and serving.
Originally developed in the [Sky Computing Lab](https://sky.cs.berkeley.edu) at UC Berkeley, vLLM has evolved into a community-driven project with contributions from both academia and industry.

Where to get started with vLLM depends on the type of user. If you are looking to:

- Run open-source models on vLLM, we recommend starting with the [Quickstart Guide](./getting_started/quickstart.md)
- Build applications with vLLM, we recommend starting with the [User Guide](./usage)
- Build vLLM, we recommend starting with the [Developer Guide](./contributing)

For information about the development of vLLM, see:

- If you are using vLLM from within Python code, see [Offline Inference](./offline_inference/)
- If you are using vLLM from an HTTP application or client, see [Online Serving](./online_serving/)
- For examples of using some of vLLM's advanced features (e.g. LMCache or Tensorizer) which are not specific to either of the above use cases, see [Others](./others/)
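To make the split above concrete, here is a minimal offline-inference sketch using vLLM's Python API, of the kind covered by the [Offline Inference](./offline_inference/) examples. The model name is only an illustration; substitute any model vLLM supports.

```python
from vllm import LLM, SamplingParams

# Illustrative model choice; any Hugging Face model supported by vLLM works here.
llm = LLM(model="facebook/opt-125m")

sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
prompts = ["The capital of France is"]

# generate() runs batched offline inference and returns one RequestOutput per prompt.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, output.outputs[0].text)
```

The [Online Serving](./online_serving/) path instead starts vLLM's OpenAI-compatible server (typically via `vllm serve <model>`) and sends requests to it over HTTP from any OpenAI-style client.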