Commit 73030b7

[ Misc ] Enable Quantizing All Layers of DeekSeekv2 (#6423)
1 parent ccd3c04 commit 73030b7

File tree

2 files changed: +6 −1 lines changed

.buildkite/lm-eval-harness/run-lm-eval-gsm-vllm-baseline.sh

Lines changed: 1 addition & 1 deletion

@@ -46,6 +46,6 @@ while getopts "m:b:l:f:t:" OPT; do
 done
 
 lm_eval --model vllm \
-  --model_args pretrained=$MODEL,tensor_parallel_size=$TP_SIZE,add_bos_token=true,distributed_executor_backend="ray",trust_remote_code=true \
+  --model_args pretrained=$MODEL,tensor_parallel_size=$TP_SIZE,add_bos_token=true,distributed_executor_backend="ray",trust_remote_code=true,max_model_len=4096 \
   --tasks gsm8k --num_fewshot $FEWSHOT --limit $LIMIT \
   --batch_size $BATCH_SIZE

vllm/model_executor/model_loader/weight_utils.py

Lines changed: 5 additions & 0 deletions

@@ -431,6 +431,11 @@ def convert_pyslice_to_tensor(x: Any) -> torch.Tensor:
 def default_weight_loader(param: torch.Tensor,
                           loaded_weight: torch.Tensor) -> None:
     """Default weight loader."""
+    # If the weight on disk does not have a shape, give it one
+    # (such as scales for AutoFp8).
+    if len(loaded_weight.shape) == 0:
+        loaded_weight = loaded_weight.reshape(1)
+
     assert param.size() == loaded_weight.size()
     param.data.copy_(loaded_weight)
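The new guard handles weights stored on disk as zero-dimensional (scalar) tensors, such as per-tensor quantization scales, which would otherwise fail the size assertion against a 1-element parameter. A minimal standalone sketch of the same logic, using plain PyTorch tensors outside of vLLM (the scalar-scale example values are illustrative, not taken from the commit):

import torch

def default_weight_loader(param: torch.Tensor,
                          loaded_weight: torch.Tensor) -> None:
    """Copy loaded_weight into param, promoting 0-dim scalars to shape (1,)."""
    # If the weight on disk does not have a shape, give it one
    # (such as scales for AutoFp8).
    if len(loaded_weight.shape) == 0:
        loaded_weight = loaded_weight.reshape(1)
    assert param.size() == loaded_weight.size()
    param.data.copy_(loaded_weight)

# A per-tensor scale saved as a scalar has shape (), while the model
# parameter it loads into has shape (1,); without the reshape, the
# size assertion would fail.
param = torch.zeros(1)
scale = torch.tensor(0.5)  # 0-dim tensor, shape ()
default_weight_loader(param, scale)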
