---
sidebar_position: 6
---

# Podman + WASM + GPU

Podman + crun with WasmEdge + [CDI](https://github.com/cncf-tags/container-device-interface) enables the use of host GPU devices. Most of the steps are the same as in [docker + wasm + gpu](./docker_wasm_gpu.md), except for the Podman installation and the run command. If you have already completed some of these steps, you can skip them.

## Prerequisites

Before we start, you need:

- A GPU device (we take NVIDIA graphics cards as our example here; so far we have only tested NVIDIA GPUs on Linux)
- The NVIDIA GPU driver
- Either the NVIDIA Container Toolkit or the nvidia-container-toolkit-base package
- Podman >= 4.0

We won't go into detail on installing the NVIDIA driver and toolkit here, but the following references should help, along with a quick way to verify that your environment is ready:

[NVIDIA driver installation on Ubuntu](https://ubuntu.com/server/docs/nvidia-drivers-installation), [Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html), [NVIDIA CDI support reference](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html)

```bash
# Check your driver and device
> nvidia-smi -L

# Check your toolkit
> nvidia-ctk --version
```

Install Podman >= 4.0

During the current testing phase, we install Podman directly from Linuxbrew to satisfy the version requirement. More elegant methods may appear in the future, and we will update the documentation accordingly.

```bash
> brew install podman

# Check your Podman version; you may also want to add it to your $PATH.
> $HOME/.linuxbrew/opt/podman/bin/podman --version
```
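
If you want the Linuxbrew build to be picked up as plain `podman`, you can put it on your `PATH`; a minimal sketch, assuming the default Linuxbrew prefix used above:

```bash
# Add the Linuxbrew Podman binary directory to PATH
# (assumes the default $HOME/.linuxbrew prefix from the step above)
export PATH="$HOME/.linuxbrew/opt/podman/bin:$PATH"
podman --version
```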

## CDI setup

[Generate the CDI specification file](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html#procedure)

```bash
> sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# Check that your CDI config is good
> nvidia-ctk cdi list

# Example output
INFO[0000] Found 2 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=all
```
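
As a quick smoke test, you can run an ordinary Linux container against one of the listed CDI devices before moving on to Wasm; a sketch adapted from NVIDIA's CDI support guide, assuming `podman` is on your `PATH` and an `ubuntu` image is available:

```bash
# Run nvidia-smi inside a regular container via the CDI device.
# On SELinux-enforcing hosts you may also need --security-opt=label=disable.
sudo podman run --rm --device nvidia.com/gpu=all docker.io/library/ubuntu nvidia-smi -L
```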

## Set up your container runtime (crun + WasmEdge + plugin system)

Build crun with both WasmEdge and the plugin system enabled:

```bash
> sudo apt install -y make git gcc build-essential pkgconf libtool libsystemd-dev libprotobuf-c-dev libcap-dev libseccomp-dev libyajl-dev go-md2man libtool autoconf python3 automake

> git clone -b enable-wasmedge-plugin https://github.com/second-state/crun
> cd crun
> ./autogen.sh
> ./configure --with-wasmedge
> make

# Check your crun
> ./crun --version
```
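
The version banner should also confirm that WasmEdge support was compiled in; a small check, assuming crun reports compiled-in handlers in its feature-flag line (e.g. a `+WASM:wasmedge` entry):

```bash
# Look for a +WASM:wasmedge feature flag in the version output
# (assumption: crun lists compiled-in handlers this way)
./crun --version | grep -i wasm
```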

Download the GGML plugin onto the host:

```bash
> curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugins wasi_nn-ggml

# Make sure all of your plugin's dependencies are satisfied
> ldd ~/.wasmedge/plugin/libwasmedgePluginWasiNN.so
```

## Demo llama with our Wasm application

> The demo image is built from the Wasm application [here](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml/llama) and uploaded [here](https://github.com/captainvincent/runwasi/pkgs/container/runwasi-demo/195178675?tag=wasmedge-ggml-llama).
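
You can optionally pre-pull the demo image; a sketch, assuming `podman pull` accepts the same `--platform` value as the run command below (otherwise the image is simply pulled on first run):

```bash
# Optional: fetch the Wasm demo image ahead of time
# (assumes --platform wasip1/wasm works for pull as it does for run)
sudo podman pull --platform wasip1/wasm ghcr.io/captainvincent/runwasi-demo:wasmedge-ggml-llama
```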

Download the inference model:

```bash
> curl -LO https://huggingface.co/second-state/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf
```
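
The run command below bind-mounts the current directory into the container as `/resource`, so double-check that the model file sits in the directory you will launch Podman from:

```bash
# The model must be in the current directory, which is mounted at /resource
ls -lh llama-2-7b-chat.Q5_K_M.gguf
```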

Run llama2 inference with Podman:

> Replace `<podman path>` and `<crun path>` with the paths to your binaries in the following command.

```bash
sudo <podman path> run -v ~/.wasmedge/plugin/libwasmedgePluginWasiNN.so:/.wasmedge/plugin/libwasmedgePluginWasiNN.so \
-v /usr/local/cuda/targets/x86_64-linux/lib/libcudart.so.12:/lib/x86_64-linux-gnu/libcudart.so.12 \
-v /usr/local/cuda/targets/x86_64-linux/lib/libcublas.so.12:/lib/x86_64-linux-gnu/libcublas.so.12 \
-v /usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.12:/lib/x86_64-linux-gnu/libcublasLt.so.12 \
-v /lib/x86_64-linux-gnu/libcuda.so.1:/lib/x86_64-linux-gnu/libcuda.so.1 \
-v .:/resource \
--env WASMEDGE_PLUGIN_PATH=/.wasmedge/plugin \
--env WASMEDGE_WASINN_PRELOAD=default:GGML:AUTO:/resource/llama-2-7b-chat.Q5_K_M.gguf \
--env n_gpu_layers=100 \
--rm --device nvidia.com/gpu=all --runtime <crun path> --annotation module.wasm.image/variant=compat-smart --platform wasip1/wasm \
ghcr.io/captainvincent/runwasi-demo:wasmedge-ggml-llama default \
$'[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you do not know the answer to a question, please do not share false information.\n<</SYS>>\nWhat is the capital of Japan?[/INST]'
```

Example result:

```
ggml_init_cublas: GGML_CUDA_FORCE_MMQ: no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
Device 0: NVIDIA GeForce GTX 1080, compute capability 6.1, VMM: yes
Prompt:
[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you do not know the answer to a question, please do not share false information.
<</SYS>>
What is the capital of Japan?[/INST]
Response:
[INFO] llama_commit: "4ffcdce2"
[INFO] llama_build_number: 2334
[INFO] Number of input tokens: 140
Thank you for your kind request! The capital of Japan is Tokyo. I'm glad to help! Please let me know if you have any other questions.
[INFO] Number of input tokens: 140
[INFO] Number of output tokens: 34
```
