Commit 5c064aa

grorge123 authored and hydai committed
[Example] MLX: add quantization
1 parent 7ee0774 commit 5c064aa

2 files changed: +10 -1 lines changed

wasmedge-mlx/README.md

Lines changed: 9 additions & 0 deletions
@@ -70,11 +70,20 @@ wasmedge --dir .:. \
 
 There are some metadata for MLX plugin you can set.
 
+### Basic setting
+
 - model_type (required): LLM model type.
 - tokenizer (required): tokenizer.json path
 - max_token (option): maximum generate token number, default is 1024.
 - enable_debug_log (option): if print debug log, default is false.
 
+### Quantization
+
+The following three parameters need to be set together.
+- is_quantized (option): If the weight is quantized. If is_quantized is false, then MLX backend will quantize the weight.
+- group_size (option): The group size to use for quantization.
+- q_bits (option): The number of bits to quantize to.
+
 ``` rust
 let graph = GraphBuilder::new(GraphEncoding::Mlx, ExecutionTarget::AUTO)
     .config(serde_json::to_string(&json!({"model_type": "tiny_llama_1.1B_chat_v1.0", "tokenizer":tokenizer_path, "max_token":100}))

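The README change states that the three quantization parameters need to be set together. A minimal sketch of an all-or-nothing check a caller could run before serializing the config follows. `QuantOptions` and `quant_options` are hypothetical names used for illustration only; they are not part of the WasmEdge MLX plugin or its API, and the plugin's own handling of a partial setting is not shown in this commit.

```rust
/// Hypothetical container for the three quantization settings.
/// Not part of the WasmEdge MLX plugin.
#[derive(Debug, PartialEq)]
struct QuantOptions {
    is_quantized: bool,
    group_size: u32,
    q_bits: u32,
}

/// Accepts either all three settings or none of them.
/// Returns `Ok(None)` when quantization is not configured at all,
/// and an error when only some of the settings are present.
fn quant_options(
    is_quantized: Option<bool>,
    group_size: Option<u32>,
    q_bits: Option<u32>,
) -> Result<Option<QuantOptions>, String> {
    match (is_quantized, group_size, q_bits) {
        (None, None, None) => Ok(None),
        (Some(is_quantized), Some(group_size), Some(q_bits)) => Ok(Some(QuantOptions {
            is_quantized,
            group_size,
            q_bits,
        })),
        _ => Err("is_quantized, group_size and q_bits must be set together".to_string()),
    }
}
```

With this helper, `quant_options(Some(false), Some(64), Some(4))` yields a complete set of options, while `quant_options(None, Some(64), None)` is rejected before any config string is built.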
wasmedge-mlx/src/main.rs

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ fn main() {
     let args: Vec<String> = env::args().collect();
     let model_name: &str = &args[1];
     let graph = GraphBuilder::new(GraphEncoding::Mlx, ExecutionTarget::AUTO)
-        .config(serde_json::to_string(&json!({"model_type": "tiny_llama_1.1B_chat_v1.0", "tokenizer":tokenizer_path, "max_token":100})).expect("Failed to serialize options"))
+        .config(serde_json::to_string(&json!({"is_quantized":false, "group_size": 64, "q_bits": 4,"model_type": "tiny_llama_1.1B_chat_v1.0", "tokenizer":tokenizer_path, "max_token":100})).expect("Failed to serialize options"))
         .build_from_cache(model_name)
         .expect("Failed to build graph");
     let mut context = graph

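The updated call passes `"group_size": 64` and `"q_bits": 4`. As a rough illustration of what those two numbers usually control, here is a sketch of group-wise affine quantization: the weights are split into groups of `group_size` values, and each group is mapped to integer codes in `0..2^q_bits` using its own scale and bias. This assumes the common one-scale-one-bias-per-group scheme; the MLX backend's actual kernels and bit packing are not shown in this commit and may differ. `quantize_groups` and `dequantize_groups` are hypothetical names.

```rust
/// Quantizes `weights` in groups of `group_size` values to `q_bits`-bit codes.
/// Returns (codes, scales, biases) with one scale and one bias per group.
/// Codes are stored one per byte for clarity; real backends pack them tighter.
fn quantize_groups(
    weights: &[f32],
    group_size: usize,
    q_bits: u32,
) -> (Vec<u8>, Vec<f32>, Vec<f32>) {
    assert!(group_size > 0 && weights.len() % group_size == 0);
    assert!((1..=8).contains(&q_bits));
    // Highest representable code, e.g. 15 for 4 bits.
    let levels = ((1u32 << q_bits) - 1) as f32;
    let mut codes = Vec::with_capacity(weights.len());
    let mut scales = Vec::new();
    let mut biases = Vec::new();
    for group in weights.chunks(group_size) {
        let min = group.iter().cloned().fold(f32::INFINITY, f32::min);
        let max = group.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        // A constant group has no range; use scale 1.0 to avoid dividing by zero.
        let scale = if max > min { (max - min) / levels } else { 1.0 };
        for &w in group {
            codes.push(((w - min) / scale).round() as u8);
        }
        scales.push(scale);
        biases.push(min);
    }
    (codes, scales, biases)
}

/// Reconstructs approximate weights as `code * scale + bias` per group.
fn dequantize_groups(
    codes: &[u8],
    scales: &[f32],
    biases: &[f32],
    group_size: usize,
) -> Vec<f32> {
    codes
        .chunks(group_size)
        .zip(scales.iter().zip(biases))
        .flat_map(|(group, (&s, &b))| group.iter().map(move |&q| q as f32 * s + b))
        .collect()
}
```

In this scheme a smaller `group_size` stores more scales and biases but tracks local weight ranges more closely, and a larger `q_bits` gives each group more code levels at the cost of more storage per weight.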