Commit 2c43060

Update lm-eval AWQ NVFP4 tests (vllm-project#2198)
SUMMARY:
- Move the AWQ NVFP4 recipes, currently under the actorder folder, to an NVFP4 folder.
- Skip the AWQ NVFP4 lm-eval tests for now, while the decompression bug is being fixed.
1 parent a756c9c commit 2c43060

4 files changed, with 4 additions and 2 deletions

File renamed without changes.

tests/e2e/vLLM/recipes/actorder/recipe_awq_nvfp4a16.yaml renamed to tests/e2e/vLLM/recipes/NVFP4/recipe_awq_nvfp4a16.yaml

File renamed without changes.

tests/lmeval/configs/awq_nvfp4.yaml renamed to tests/lmeval/skipped_configs/awq_nvfp4.yaml

Lines changed: 2 additions & 1 deletion

@@ -4,10 +4,11 @@ scheme: NVFP4
 dataset_id: HuggingFaceH4/ultrachat_200k
 dataset_split: train_sft
 num_calibration_samples: 20
-recipe: tests/e2e/vLLM/recipes/actorder/recipe_awq_nvfp4.yaml
+recipe: tests/e2e/vLLM/recipes/NVFP4/recipe_awq_nvfp4.yaml
 lmeval:
   # NVFP4 (4-bit weights + 4-bit activations) has lower recovery than FP8/INT8
   # Observed: strict-match ~92.81%, flexible-extract ~89.59%
+  # TODO: check if recovery is consistent - 0.65 is too low for 0.94 recovery
   recovery_threshold:
     exact_match,strict-match: 0.92
     exact_match,flexible-extract: 0.88

tests/lmeval/configs/awq_nvfp4a16.yaml renamed to tests/lmeval/skipped_configs/awq_nvfp4a16.yaml

Lines changed: 2 additions & 1 deletion

@@ -4,10 +4,11 @@ scheme: NVFP4
 dataset_id: HuggingFaceH4/ultrachat_200k
 dataset_split: train_sft
 num_calibration_samples: 20
-recipe: tests/e2e/vLLM/recipes/actorder/recipe_awq_nvfp4a16.yaml
+recipe: tests/e2e/vLLM/recipes/NVFP4/recipe_awq_nvfp4a16.yaml
 lmeval:
   # NVFP4 (4-bit weights + 4-bit activations) has lower recovery than FP8/INT8
   # Observed: strict-match ~92.81%, flexible-extract ~89.59%
+  # TODO: check if recovery is consistent - 0.65 is too low for 0.94 recovery
   recovery_threshold:
     exact_match,strict-match: 0.95
     exact_match,flexible-extract: 0.94
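
For readers unfamiliar with the `recovery_threshold` keys, the following is a minimal sketch of how such a check could work. It assumes that recovery is the compressed model's score divided by the baseline model's score, and that a metric passes when its recovery meets the configured threshold. The helper names and the example scores are hypothetical and are not taken from the test harness.

```python
def recovery(compressed_score: float, baseline_score: float) -> float:
    """Fraction of the baseline metric retained by the compressed model."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return compressed_score / baseline_score


def check_recovery(compressed: dict, baseline: dict, thresholds: dict) -> dict:
    """Return {metric: (recovery, passed)} for every metric in thresholds."""
    results = {}
    for metric, threshold in thresholds.items():
        r = recovery(compressed[metric], baseline[metric])
        results[metric] = (r, r >= threshold)
    return results


# Thresholds copied from awq_nvfp4a16.yaml in this commit.
thresholds = {
    "exact_match,strict-match": 0.95,
    "exact_match,flexible-extract": 0.94,
}

# Hypothetical scores, for illustration only (not measured results).
baseline = {
    "exact_match,strict-match": 0.80,
    "exact_match,flexible-extract": 0.80,
}
compressed = {
    "exact_match,strict-match": 0.78,
    "exact_match,flexible-extract": 0.74,
}

results = check_recovery(compressed, baseline, thresholds)
for metric, (r, passed) in results.items():
    print(f"{metric}: recovery={r:.3f} passed={passed}")
```

Under this assumed definition, a threshold acts as a floor on the ratio rather than on the raw score, so the same threshold can pass or fail depending on how strong the baseline is.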
