Commit 4f13674

Add Llama xnnpack recipe (#15167)
1 parent 3ccb6ab commit 4f13674

1 file changed: +17 -0 lines changed

@@ -0,0 +1,17 @@
+base:
+  metadata: '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}'
+
+model:
+  use_sdpa_with_kv_cache: True
+  use_kv_cache: True
+  dtype_override: fp32
+
+quantization:
+  qmode: 8da4w
+  group_size: 128
+  embedding_quantize: 4,32
+
+backend:
+  xnnpack:
+    enabled: True
+    extended_ops: True
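
Two values in this recipe are strings that carry structured data: `metadata` is a JSON object wrapped in single quotes, and `embedding_quantize` is a comma-separated pair. The sketch below decodes both using only the Python standard library. The helper names are hypothetical, and reading `4,32` as `<bitwidth>,<group_size>` is an assumption about the field layout that this commit does not confirm.

```python
import json

# Values copied verbatim from the recipe; the constant names are illustrative.
METADATA = '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}'
EMBEDDING_QUANTIZE = "4,32"


def parse_metadata(raw: str) -> dict:
    """Decode the JSON string stored under base.metadata."""
    return json.loads(raw)


def parse_embedding_quantize(raw: str) -> tuple[int, int]:
    """Split the string into two integers.

    Assumption: the layout is '<bitwidth>,<group_size>'.
    """
    bitwidth, group_size = (int(part) for part in raw.split(","))
    return bitwidth, group_size


metadata = parse_metadata(METADATA)
bitwidth, group_size = parse_embedding_quantize(EMBEDDING_QUANTIZE)

print("BOS id:", metadata["get_bos_id"])
print("EOS ids:", metadata["get_eos_ids"])
print("embedding bitwidth:", bitwidth, "group size:", group_size)
```

A check like this can catch a malformed `metadata` string (for example, mismatched quotes) before an export is started, since `json.loads` raises `json.JSONDecodeError` on invalid input.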
