# Deepseek example with WasmEdge WASI-NN OpenVINO GenAI plugin

This example demonstrates how to use the WasmEdge WASI-NN OpenVINO GenAI plugin to perform an inference task with a DeepSeek model.

## Set up the environment

- Install `rustup` and `Rust`

  Go to the [official Rust webpage](https://www.rust-lang.org/tools/install) and follow the instructions to install `rustup` and `Rust`.

  > It is recommended to use Rust 1.68 or above in the stable channel.

  Then, add the `wasm32-wasi` target to the Rustup toolchain:

  ```bash
  rustup target add wasm32-wasi
  ```
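
  A quick sanity check that the target is available in the active toolchain:

  ```bash
  # wasm32-wasi should appear in the list of installed targets
  rustup target list --installed | grep wasm32-wasi
  rustc --version
  ```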

- Clone the example repo

  ```bash
  git clone https://github.com/second-state/WasmEdge-WASINN-examples.git
  ```
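
  Then change into the example's directory. The directory name below is an assumption based on the example name; check the repository layout if it differs:

  ```bash
  cd WasmEdge-WASINN-examples/openvinogenai-deepseek-raw
  ```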

- Install OpenVINO GenAI

  Please refer to the [WasmEdge Docs](https://wasmedge.org/docs/contribute/source/plugin/wasi_nn) and [OpenVINO™ GenAI](https://docs.openvino.ai/2025/get-started/install-openvino/install-openvino-genai.html) for the installation process.

  ```bash
  # ensure the OpenVINO environment is initialized
  source setupvars.sh
  ```
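
  To confirm the environment was picked up (assuming a default OpenVINO install, which exports `INTEL_OPENVINO_DIR`):

  ```bash
  # should print the OpenVINO install location after sourcing setupvars.sh
  echo "$INTEL_OPENVINO_DIR"
  ```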

- Build WasmEdge with the WASI-NN OpenVINO GenAI plugin from source

  ```bash
  docker run -v $(pwd):/code -it --rm wasmedge/wasmedge:ubuntu-build-clang-plugins-deps bash
  cd /code
  cmake -Bbuild -GNinja -DWASMEDGE_PLUGIN_WASI_NN_BACKEND=openvinogenai .
  # build the CLI and the plugin
  cmake --build build
  ```
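
  After the build finishes, the WASI-NN plugin and the `wasmedge` CLI are produced inside the `build` directory. The paths below follow the default WasmEdge build layout and may differ between versions:

  ```bash
  # point the freshly built CLI at the plugin directory and verify it runs
  export WASMEDGE_PLUGIN_PATH=/code/build/plugins/wasi_nn
  /code/build/tools/wasmedge/wasmedge --version
  ```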

## Build and run the `openvinogenai-deepseek-raw` example

- Download the `DeepSeek` model ([huggingface](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B))
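
  One way to fetch the weights locally is the `huggingface-cli` tool (this step is optional; the `optimum-cli` command in the next step can also pull the model straight from the Hub by its id):

  ```bash
  pip install -U "huggingface_hub[cli]"
  # download the original checkpoint into a local folder
  huggingface-cli download deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --local-dir DeepSeek-R1-Distill-Qwen-1.5B
  ```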

- Convert the model with `optimum-intel`

  ```bash
  python3 -m venv .venv
  . .venv/bin/activate

  pip install --upgrade --upgrade-strategy eager "optimum[openvino]"

  optimum-cli export openvino --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --task text-generation-with-past --weight-format int4 --group-size 128 --ratio 1.0 --sym DeepSeek-R1-Distill-Qwen-1.5B/INT4_compressed_weights
  ```
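
  The export folder should now contain the OpenVINO IR and tokenizer files (the file names in the comment are what `optimum-cli` typically produces):

  ```bash
  # expect openvino_model.xml/.bin, openvino_tokenizer.xml, openvino_detokenizer.xml, config files, ...
  ls DeepSeek-R1-Distill-Qwen-1.5B/INT4_compressed_weights
  ```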

- Adjust the DeepSeek chat template

  OpenVINO GenAI may not accept the default `chat_template` in `openvino_tokenizer.xml`. Replace it with a valid template:

  ```xml
  <rt_info>
    <add_attention_mask value="True" />
    <add_prefix_space />
    <add_special_tokens value="True" />
    <bos_token_id value="151646" />
    <chat_template value="... /*replace this*/" />
  ```

  For a known-good template, see [OpenVINO: LLM reasoning with DeepSeek-R1 distilled models](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/deepseek-r1) and use the chat template defined in its `llm_config.py`.
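
  A minimal sketch of how the attribute could be patched in place, assuming the export path from the previous step and a template string copied from `llm_config.py` (adjust both to your setup):

  ```bash
  python3 - <<'EOF'
  import xml.etree.ElementTree as ET

  # assumed location of the exported tokenizer IR; change to your export folder
  path = "DeepSeek-R1-Distill-Qwen-1.5B/INT4_compressed_weights/openvino_tokenizer.xml"
  new_template = "..."  # paste the chat template from llm_config.py here

  tree = ET.parse(path)
  node = tree.getroot().find("rt_info/chat_template")
  if node is None:
      raise SystemExit("chat_template entry not found in " + path)
  node.set("value", new_template)
  tree.write(path, encoding="utf-8", xml_declaration=True)
  EOF
  ```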

- Build the example

  ```bash
  cargo build --target wasm32-wasi --release
  ```
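
  Optionally, AOT-compile the Wasm module for faster startup (the `wasmedge compile` subcommand is available in recent WasmEdge releases):

  ```bash
  wasmedge compile ./target/wasm32-wasi/release/openvinogenai-deepseek-raw.wasm openvinogenai-deepseek-raw-aot.wasm
  ```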

- Run the example

  ```bash
  wasmedge ./target/wasm32-wasi/release/openvinogenai-deepseek-raw.wasm path_to_model_xml_folder
  ```
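
  For instance, pointing the example at the INT4 export folder produced earlier (the path is taken from the `optimum-cli` command above; adjust it if you exported elsewhere):

  ```bash
  wasmedge ./target/wasm32-wasi/release/openvinogenai-deepseek-raw.wasm ./DeepSeek-R1-Distill-Qwen-1.5B/INT4_compressed_weights
  ```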

  You should see output similar to the following:

  ```console
  Load graph ...done
  Init execution context ...done
  Set input tensor ...done
  Generating ...done
  Get the result ...Retrieve the output ...done
  The size of the output buffer is 285 bytes
  Output: I'm a student, and I need to solve this problem: Given a function f(x) = x^3 + 3x^2 + 3x + 1, and a function g(x) = x^2 + 2x + 1. I need to find the number of real roots of f(x) and g(x). Also, I need to find the number of real roots of f(x) + g(x). Please explain step by step. I'm a
  done
  ```