README.md: 2 additions & 1 deletion
@@ -26,7 +26,8 @@ Please register [here](https://lu.ma/ygxbpzhl) and join us!
 ---
 
 *Latest News* 🔥
--[2023/12] Added ROCm support to vLLM.
+-[2024/01] Added ROCm 6.0 support to vLLM.
+-[2023/12] Added ROCm 5.7 support to vLLM.
 -[2023/10] We hosted [the first vLLM meetup](https://lu.ma/first-vllm-meetup) in SF! Please find the meetup slides [here](https://docs.google.com/presentation/d/1QL-XPFXiFpDBh86DbEegFXBXFXjix4v032GhShbKf3s/edit?usp=sharing).
 -[2023/09] We created our [Discord server](https://discord.gg/jz7wjKhh6g)! Join us to discuss vLLM and LLM serving! We will also post the latest announcements and updates there.
 -[2023/09] We released our [PagedAttention paper](https://arxiv.org/abs/2309.06180) on arXiv!
@@ -95,6 +100,23 @@ You can build and install vLLM from source:
 Build a docker image from `Dockerfile.rocm`, and launch a docker container.
 
+The `Dockerfile.rocm` is designed to support both ROCm 5.7 and ROCm 6.0 (and later versions). It provides flexibility to customize the docker image build using the following arguments:
+
+* `BASE_IMAGE`: specifies the base image used when running ``docker build``, specifically the PyTorch on ROCm base image. We have tested ROCm 5.7 and ROCm 6.0. The default is `rocm/pytorch:rocm6.0_ubuntu20.04_py3.9_pytorch_2.1.1`
+* `FX_GFX_ARCHS`: specifies the GFX architecture used to build flash-attention, for example, `gfx90a;gfx942` for MI200 and MI300. The default is `gfx90a;gfx942`
+* `FA_BRANCH`: specifies the branch used to build flash-attention in `ROCmSoftwarePlatform's flash-attention repo <https://github.com/ROCmSoftwarePlatform/flash-attention>`_. The default is `3d2b6f5`
+
+Their values can be passed in when running ``docker build`` with ``--build-arg`` options.
+
+For example, to build a docker image for vLLM on ROCm 5.7, you can run ``docker build`` with ``BASE_IMAGE`` overridden (see the sketch below).
+
+To build vLLM on ROCm 6.0, you can use the default:
+
 .. code-block:: console
 
     $ docker build -f Dockerfile.rocm -t vllm-rocm .
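For the ROCm 5.7 case mentioned above, a minimal sketch of passing the build arguments via ``--build-arg``; the ROCm 5.7 base image tag here is an assumption chosen for illustration, not a value taken from this change:

.. code-block:: console

    $ # The BASE_IMAGE tag below is a placeholder; substitute the PyTorch-on-ROCm 5.7 image you use
    $ docker build --build-arg BASE_IMAGE="rocm/pytorch:rocm5.7_ubuntu20.04_py3.9_pytorch_2.0.1" \
        -f Dockerfile.rocm -t vllm-rocm .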
@@ -142,3 +164,8 @@ Alternatively, if you plan to install vLLM-ROCm on a local machine or start from
 $ cd vllm
 $ pip install -U -r requirements-rocm.txt
 $ python setup.py install  # This may take 5-10 minutes.
+
+.. note::
+
+    - You may need to turn on the ``--enforce-eager`` flag if you experience a process hang when running the `benchmark_throughput.py` script to test your installation.
+
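To exercise the note above, one possible smoke test is to run the throughput benchmark with eager mode forced on. This is only a sketch: the model name, sequence lengths, and prompt count are placeholder values, and the script's exact options may differ in your checkout:

.. code-block:: console

    $ # Placeholder model and lengths; --enforce-eager skips CUDA/HIP graph capture
    $ python benchmarks/benchmark_throughput.py --model facebook/opt-125m \
        --input-len 128 --output-len 128 --num-prompts 100 --enforce-eager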