1 parent 3ccb6ab commit 4f13674
examples/models/llama/config/llama_xnnpack.yaml
@@ -0,0 +1,17 @@
+base:
+  metadata: '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}'
+
+model:
+  use_sdpa_with_kv_cache: True
+  use_kv_cache: True
+  dtype_override: fp32
+
+quantization:
+  qmode: 8da4w
+  group_size: 128
+  embedding_quantize: 4,32
+
+backend:
+  xnnpack:
+    enabled: True
+    extended_ops: True
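Two values in this config are strings that carry structured data: `base.metadata` holds a JSON object, and `quantization.embedding_quantize` holds a comma-separated pair. The sketch below shows one way a reader might decode them for inspection. The helper names `parse_metadata` and `parse_embedding_quantize` are hypothetical and not part of the commit, and reading `4,32` as "bit width, group size" is an assumption about the convention rather than something the diff states.

```python
import json
from typing import Tuple

# Values copied verbatim from the added YAML file.
METADATA = '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}'
EMBEDDING_QUANTIZE = "4,32"


def parse_metadata(raw: str) -> dict:
    """Decode the JSON string stored under base.metadata (hypothetical helper)."""
    return json.loads(raw)


def parse_embedding_quantize(raw: str) -> Tuple[int, int]:
    """Split a "<first>,<second>" string into two integers.

    Assumption: the first number is the bit width and the second is the
    group size. The diff itself does not document this layout.
    """
    bitwidth, group_size = (int(part) for part in raw.split(","))
    return bitwidth, group_size


meta = parse_metadata(METADATA)
bits, group = parse_embedding_quantize(EMBEDDING_QUANTIZE)

print("BOS id:", meta["get_bos_id"])
print("EOS ids:", meta["get_eos_ids"])
print("embedding bit width:", bits, "group size:", group)
```

Keeping the metadata as a single-quoted YAML scalar means the inner double quotes need no escaping, so the string can be passed straight to a JSON parser as done here.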