This example demonstrates how to use the WasmEdge WASI-NN MLX plugin to run an inference task with an LLM.

## Supported Models

| Family | Models |
|--------|--------|
| LLaMA 2 | llama_2_7b_chat_hf |
| LLaMA 3 | llama_3_8b |
| TinyLLaMA | tiny_llama_1.1B_chat_v1.0 |

## Install WasmEdge with WASI-NN MLX plugin

The MLX backend relies on [MLX](https://github.com/ml-explore/mlx), which is downloaded automatically when you build WasmEdge, so you do not need to install it yourself. To use a custom MLX build, install it yourself or set the `CMAKE_PREFIX_PATH` variable when configuring CMake.
```bash
# To use the WASI-NN plugin, you need to install the project.
cmake --install build
```

After installation, the `wasmedge` runtime executable is available under `/usr/local/bin`, and the WASI-NN plug-in with the MLX backend is located at `/usr/local/lib/wasmedge/libwasmedgePluginWasiNN.so`.

## Download the model and tokenizer

This example uses `tiny_llama_1.1B_chat_v1.0`; you can switch to `llama_2_7b_chat_hf` or `llama_3_8b` instead.
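
Because the model name is a free-form string, a typo is easy to miss. The following is a minimal sketch of validating a requested name against the Supported Models table before using it. The `SUPPORTED_MODELS` constant and the `resolve_model_type` helper are hypothetical names introduced for illustration; they are not part of WasmEdge or of this example's code.

```rust
/// Model identifiers copied from the "Supported Models" table.
/// The constant name is an assumption made for this sketch.
const SUPPORTED_MODELS: [&str; 3] = [
    "llama_2_7b_chat_hf",
    "llama_3_8b",
    "tiny_llama_1.1B_chat_v1.0",
];

/// Hypothetical helper: returns the name unchanged if it appears in the
/// table, otherwise an error message listing the accepted values.
fn resolve_model_type(name: &str) -> Result<&str, String> {
    if SUPPORTED_MODELS.contains(&name) {
        Ok(name)
    } else {
        Err(format!(
            "unsupported model '{}'; expected one of: {}",
            name,
            SUPPORTED_MODELS.join(", ")
        ))
    }
}

fn main() {
    match resolve_model_type("tiny_llama_1.1B_chat_v1.0") {
        Ok(model) => println!("using model: {model}"),
        Err(msg) => eprintln!("{msg}"),
    }
}
```

Checking the name up front reports a misspelled model before any model loading is attempted.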
let prompt = "Once upon a time, there existed a little girl,";
-let graph = GraphBuilder::new(GraphEncoding::Mlx, ExecutionTarget::AUTO)
-    .config(serde_json::to_string(&json!({"tokenizer": tokenizer_path})).expect("Failed to serialize options"))
+let tokenizer_path = "./tokenizer.json";
+let prompt = "Once upon a time, there existed a little girl,";
+let args: Vec<String> = env::args().collect();
+let model_name: &str = &args[1];
+let graph = GraphBuilder::new(GraphEncoding::Mlx, ExecutionTarget::AUTO)
+    .config(serde_json::to_string(&json!({"model_type": "tiny_llama_1.1B_chat_v1.0", "tokenizer": tokenizer_path, "max_token": 100})).expect("Failed to serialize options"))
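
Two details of the code above may be worth illustrating: `&args[1]` panics when no argument is supplied, and the options are a JSON object with the keys `model_type`, `tokenizer`, and `max_token`. The sketch below is standalone and dependency-free, so it formats the JSON with `format!` instead of `serde_json`. The helpers `first_arg` and `build_config` are hypothetical names, and the sketch assumes the inputs contain no characters that need JSON escaping. How `model_name` is used afterwards is not shown in this excerpt, so the sketch only prints it.

```rust
use std::env;

/// Hypothetical helper: builds the JSON options string with the same keys
/// that are passed to `.config(...)` in the example.
/// Assumes the inputs contain no characters that require JSON escaping.
fn build_config(model_type: &str, tokenizer_path: &str, max_token: u32) -> String {
    format!(
        r#"{{"model_type":"{}","tokenizer":"{}","max_token":{}}}"#,
        model_type, tokenizer_path, max_token
    )
}

/// Hypothetical helper: returns the first command-line argument, or a
/// usage message instead of panicking when it is missing.
fn first_arg(args: &[String]) -> Result<&str, String> {
    args.get(1)
        .map(String::as_str)
        .ok_or_else(|| "usage: <program> <model_name>".to_string())
}

fn main() {
    let args: Vec<String> = env::args().collect();
    match first_arg(&args) {
        Ok(model_name) => {
            let config = build_config("tiny_llama_1.1B_chat_v1.0", "./tokenizer.json", 100);
            println!("model argument: {model_name}");
            println!("config: {config}");
        }
        Err(usage) => eprintln!("{usage}"),
    }
}
```

In the real program, `serde_json::to_string(&json!({...}))` remains the safer choice because it handles escaping; the manual formatting here only keeps the sketch free of external crates.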