
Commit 89276c5

Authored by LFsWang
[Example] Openvino GenAI Example (#185)
Signed-off-by: LFsWang <[email protected]>
1 parent eee8c0d commit 89276c5

File tree

3 files changed: +155 -0 lines changed


openvinogenai-raw/README.md

Lines changed: 95 additions & 0 deletions
# DeepSeek example with the WasmEdge WASI-NN OpenVINO GenAI plugin

This example demonstrates how to use the WasmEdge WASI-NN OpenVINO GenAI plugin to perform an inference task with a DeepSeek model.

## Set up the environment

- Install `rustup` and `Rust`

Go to the [official Rust webpage](https://www.rust-lang.org/tools/install) and follow the instructions to install `rustup` and `Rust`.

> It is recommended to use Rust 1.68 or above in the stable channel.

Then, add the `wasm32-wasi` target to the Rust toolchain:

```bash
rustup target add wasm32-wasi
```
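
To double-check that the target was added (an optional verification step, not part of the original example), `rustup` can list the installed targets:

```bash
# Confirm the wasm32-wasi target is installed
rustup target list --installed | grep wasm32-wasi
```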

- Clone the example repo

```bash
git clone https://github.com/second-state/WasmEdge-WASINN-examples.git
```

- Install OpenVINO GenAI

Please refer to the [WasmEdge Docs](https://wasmedge.org/docs/contribute/source/plugin/wasi_nn) and [OpenVINO™ GenAI](https://docs.openvino.ai/2025/get-started/install-openvino/install-openvino-genai.html) for the installation process.

```bash
# ensure the OpenVINO environment is initialized
source setupvars.sh
```
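
Where `setupvars.sh` lives depends on how you installed OpenVINO; the path below is only an illustration, assuming a Linux archive install under the common `/opt/intel` layout:

```bash
# Adjust this path to match your actual OpenVINO installation
source /opt/intel/openvino_2025/setupvars.sh
```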

- Build WasmEdge with the WASI-NN OpenVINO GenAI plugin from source

```bash
docker run -v $(pwd):/code -it --rm wasmedge/wasmedge:ubuntu-build-clang-plugins-deps bash
cd /code
cmake -Bbuild -GNinja -DWASMEDGE_PLUGIN_WASI_NN_BACKEND=openvinogenai .
# compile WasmEdge and the plugin
cmake --build build
```
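
Once the build finishes, the WASI-NN plugin shared library is produced inside the build tree. The path below is the usual WasmEdge layout, but it may differ depending on your build configuration:

```bash
# Locate the built plugin library (the exact path may vary with your build setup)
ls build/plugins/wasi_nn/libwasmedgePluginWasiNN.so
```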

## Build and run the `openvinogenai-deepseek-raw` example

- Download the `DeepSeek` model ([Hugging Face](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B))

- Convert the model with optimum-intel

```bash
python3 -m venv .venv
. .venv/bin/activate

pip install --upgrade --upgrade-strategy eager "optimum[openvino]"

optimum-cli export openvino --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --task text-generation-with-past --weight-format int4 --group-size 128 --ratio 1.0 --sym DeepSeek-R1-Distill-Qwen-1.5B/INT4_compressed_weights
```
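
The export writes an OpenVINO IR together with tokenizer and detokenizer models into the output folder. The file names below follow the standard optimum-intel export layout; verify that they are present before running the example:

```bash
# Inspect the exported model folder (standard optimum-intel output names)
ls DeepSeek-R1-Distill-Qwen-1.5B/INT4_compressed_weights
# Expected files include openvino_model.xml/.bin, openvino_tokenizer.xml/.bin,
# and openvino_detokenizer.xml/.bin, plus the tokenizer/config JSON files.
```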

- Adjust the DeepSeek chat template

OpenVINO GenAI may not accept the default `chat_template` in `openvino_tokenizer.xml`. Replace it with a valid template:

```xml
<rt_info>
    <add_attention_mask value="True" />
    <add_prefix_space />
    <add_special_tokens value="True" />
    <bos_token_id value="151646" />
    <chat_template value="... /*replace this*/" />
```

You can refer to [OpenVINO: LLM reasoning with DeepSeek-R1 distilled models](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/deepseek-r1) and use the chat template from its `llm_config.py`.
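
If you want to see what the exported template currently contains before replacing it, one option is to read the attribute straight from the XML, e.g. with `xmllint` (any XML tool works):

```bash
# Print the current chat_template attribute from the tokenizer IR
xmllint --xpath 'string(//chat_template/@value)' openvino_tokenizer.xml
```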

- Build the example

```bash
cargo build --target wasm32-wasi --release
```

- Run the example

```bash
wasmedge ./target/wasm32-wasi/release/openvinogenai-deepseek-raw.wasm path_to_model_xml_folder
```
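
If WasmEdge cannot find the plugin at run time (for example, when it was built from source but not installed system-wide), you can point it at the build output with the `WASMEDGE_PLUGIN_PATH` environment variable; the path here assumes the source build from the previous section:

```bash
# Tell WasmEdge where to find the locally built WASI-NN plugin
export WASMEDGE_PLUGIN_PATH=$(pwd)/build/plugins/wasi_nn
wasmedge ./target/wasm32-wasi/release/openvinogenai-deepseek-raw.wasm path_to_model_xml_folder
```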

You will get output like the following:

```console
Load graph ...done
Init execution context ...done
Set input tensor ...done
Generating ...done
Get the result ...Retrieve the output ...done
The size of the output buffer is 285 bytes
Output: I'm a student, and I need to solve this problem: Given a function f(x) = x^3 + 3x^2 + 3x + 1, and a function g(x) = x^2 + 2x + 1. I need to find the number of real roots of f(x) and g(x). Also, I need to find the number of real roots of f(x) + g(x). Please explain step by step. I'm a
done
```

openvinogenai-raw/rust/Cargo.toml

Lines changed: 14 additions & 0 deletions
[package]
name = "openvinogenai-deepseek-raw"
version = "0.1.0"
authors = ["Second-State"]
readme = "README.md"
edition = "2021"
publish = false

[dependencies]
serde_json = "1.0"
wasmedge-wasi-nn = { git = "https://github.com/LFsWang/wasmedge-wasi-nn", branch = "ggml" }
hound = "3.4"

[workspace]

openvinogenai-raw/rust/src/main.rs

Lines changed: 46 additions & 0 deletions
use std::env;
use std::io::{self, Write};
use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

pub fn main() -> Result<(), Box<dyn std::error::Error>> {
    let args: Vec<String> = env::args().collect();
    if args.len() < 2 {
        eprintln!("Usage: {} path_to_model_xml_folder", args[0]);
        std::process::exit(1);
    }
    // The GenAI backend is configured with three strings: the pipeline type,
    // the folder containing the model XML files, and a (here empty) plugin config.
    let model_type: &str = "LLMPipeline";
    let model_xml_path: &str = &args[1];
    let plugin_config: &str = "";

    print!("Load graph ...");
    let graph = GraphBuilder::new(GraphEncoding::OpenvinoGenAI, ExecutionTarget::CPU)
        .build_from_bytes([model_type, model_xml_path, plugin_config])?;
    println!("done");

    print!("Init execution context ...");
    let mut context = graph.init_execution_context()?;
    println!("done");

    print!("Set input tensor ...");
    // The prompt is passed as a raw UTF-8 byte string with a single dimension.
    let input_dims = vec![1];
    let tensor_data = "Hello, how are you?".as_bytes().to_vec();
    context.set_input(0, TensorType::U8, &input_dims, tensor_data)?;
    println!("done");

    print!("Generating ...");
    context.compute()?;
    println!("done");

    print!("Get the result ...");
    print!("Retrieve the output ...");
    // Copy the output into a buffer; get_output returns the number of bytes written.
    let mut output_buffer = vec![0u8; 1001];
    let size_in_bytes = context.get_output(0, &mut output_buffer)?;
    println!("done");
    println!("The size of the output buffer is {} bytes", size_in_bytes);

    // Only the first size_in_bytes bytes are valid output; drop the trailing zeros.
    let string_output = String::from_utf8(output_buffer[..size_in_bytes].to_vec())?;
    println!("Output: {}", string_output);
    println!("done");
    io::stdout().flush()?;

    Ok(())
}
