We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
There was an error while loading. Please reload this page.
1 parent 67e693b commit 8a96432Copy full SHA for 8a96432
benchmarks/yaml/GLM45-air-32k-bf16.yaml
@@ -0,0 +1,5 @@
1
+max_model_len: 32768
2
+max_num_seqs: 128
3
+tensor_parallel_size: 4
4
+use_cudagraph: True
5
+load_choices: "default_v1"
benchmarks/yaml/GLM45-air-32k-wfp8afp8.yaml
@@ -0,0 +1,6 @@
6
+quantization: wfp8afp8
benchmarks/yaml/request_yaml/GLM-32k.yaml
@@ -0,0 +1,8 @@
+top_p: 0.95
+temperature: 0.6
+metadata:
+ min_tokens: 1
+max_tokens: 12288
+repetition_penalty: 1.0
7
+frequency_penalty: 0
8
+presence_penalty: 0
0 commit comments