Running on AMD GPU
Thanks to the great work of Odonata (Discord), the hardware of oceanmasterza (Discord), and the help of epicx (Discord), we have the AMD instructions below.
On the host machine:

```
docker pull rocm/pytorch-nightly
sudo docker run -it --network=host --device=/dev/kfd --device=/dev/dri --group-add=video --ipc=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined rocm/pytorch-nightly
```

In the running image:
```
cd /home
export HSA_OVERRIDE_GFX_VERSION=10.3.0
```
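As a quick, optional sanity check (our suggestion, not part of the original steps), you can confirm that the container actually sees your GPU using the ROCm tools that ship with the image:

```
# Optional: both commands should list your GPU if the device pass-through worked
rocm-smi
rocminfo | head -n 30
```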
```
# Install bitsandbytes with ROCm support
git clone https://github.com/arlo-phoenix/bitsandbytes-rocm-5.6.git bitsandbytes
cd bitsandbytes
make hip ROCM_TARGET=gfx1030
pip install pip --upgrade
pip install .
```
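If you want to verify the build before moving on, a minimal check (an assumption on our part, not an official step) is that the package imports cleanly:

```
# Optional: the import fails with a loader error if the HIP build went wrong
python -c "import bitsandbytes; print(bitsandbytes.__version__)"
```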
```
# Install Petals
cd ..
pip install --upgrade git+https://github.com/bigscience-workshop/petals

# Run server
python -m petals.cli.run_server stabilityai/StableBeluga2 --port <an open port>
```

Multi-GPU operation is still untested (Docker's `--gpus` flag may not work with ROCm at this time, and other virtualization tools may be necessary).
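One untested workaround for multiple GPUs, sketched here as an assumption rather than a verified recipe, is to start one server process per GPU and pin each process to a single device with ROCm's `HIP_VISIBLE_DEVICES` variable (the port numbers are arbitrary examples):

```
# Untested sketch: one Petals server per GPU, each pinned to a single device
HIP_VISIBLE_DEVICES=0 python -m petals.cli.run_server stabilityai/StableBeluga2 --port 31330 &
HIP_VISIBLE_DEVICES=1 python -m petals.cli.run_server stabilityai/StableBeluga2 --port 31331 &
```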
Contributed by: @edt-xx, @bennmann
Tested on:
- AMD 6600 XT, tested July 24th, 2023 on Arch Linux with ROCm 5.6.0 and Mesa 22.1.4
- AMD 6900 XT, tested April 18th, 2023 on bare-metal Ubuntu 22.04 (no Docker/Anaconda/container), with ROCm 5.4.2
- Untested on the 7000 series; however, 7000-series cards may perform much better, since AMD added a machine-learning tensor library and better hardware support (vs. ray tracing only on the 6000 series)
Guide:
- Use the mesa-clover and mesa-rusticl OpenCL variants.
- Add `export HSA_OVERRIDE_GFX_VERSION=10.3.0` to your environment (put it in `/home/user/.bashrc` on Ubuntu; this tricks ROCm into working on more consumer-grade cards like the 6000 series).
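  A minimal sketch of this step, assuming a default Bash setup; the `rocminfo` line is an optional extra:

  ```
  # Persist the override for future shells (assumes Bash; adapt for other shells)
  echo 'export HSA_OVERRIDE_GFX_VERSION=10.3.0' >> ~/.bashrc
  source ~/.bashrc

  # Optional: inspect the gfx target(s) the ROCm runtime reports
  rocminfo | grep gfx
  ```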
- Install ROCm. For Arch Linux, use this tutorial: https://wiki.archlinux.org/title/GPGPU
- Create and activate a venv for Petals using Python 3.11:

  ```
  python -m venv <yourvenvpath>
  cd <yourvenvpath>
  source bin/activate
  ```
- In the venv, install the PyTorch nightly build with the command generated by the website: https://pytorch.org/get-started/locally/ (an example is sketched below).
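  For reference, the generated command looks roughly like this; treat it as an assumption and prefer whatever the website currently outputs, since the ROCm version suffix changes between releases:

  ```
  # Example only: PyTorch nightly wheels built against ROCm 5.6
  pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.6
  ```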
- Install the Petals version with AMD GPU support:

  ```
  pip install git+https://github.com/bigscience-workshop/petals@amd-gpus
  ```

  This branch uses an older version of `bitsandbytes` patched to have AMD GPU support (developed by @brontoc and Titaniumtown). This means that you won't be able to use 4-bit quantization (`--quant_type nf4`) or LoRA adapters (the `--adapters` argument). The server will use 8-bit quantization (int8) for all models by default.

  Tip: You can set your fans to full speed or close to it before starting Petals (the default Linux fan profile for AMD GPUs is not good on some cards):

  ```
  rocm-smi --setfan 99%
  ```
- Run Petals using:

  ```
  python -m petals.cli.run_server stabilityai/StableBeluga2
  ```

  Tip: You can monitor temperature and voltage by running this:

  ```
  rocm-smi && rocm-smi -t
  ```
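Once the server is running, you can also check that it has joined the public swarm on the Petals health monitor: https://health.petals.dev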