Commit a355999

Fix VLM

Signed-off-by: Chenjie Luo <[email protected]>

1 parent 0c56584 commit a355999

File tree

2 files changed: 18 additions, 1 deletion


examples/vlm_ptq/scripts/huggingface_example.sh

Lines changed: 17 additions & 0 deletions
@@ -69,6 +69,9 @@ if $TRUST_REMOTE_CODE; then
     PTQ_ARGS+=" --trust_remote_code "
 fi
 
+if [ -n "$KV_CACHE_QUANT" ]; then
+    PTQ_ARGS+=" --kv_cache_qformat=$KV_CACHE_QUANT "
+fi
 
 if [ "${MODEL_TYPE}" = "vila" ]; then
     # Install required dependency for VILA
@@ -98,6 +101,20 @@ if [[ $TASKS =~ "quant" ]] || [[ ! -d "$SAVE_PATH" ]] || [[ ! $(ls -A $SAVE_PATH
     fi
 fi
 
+if [[ "$QFORMAT" != "fp8" ]]; then
+    echo "For quant format $QFORMAT, please refer to the TensorRT-LLM documentation for deployment. Checkpoint saved to $SAVE_PATH."
+    exit 0
+fi
+
+if [[ "$QFORMAT" == *"nvfp4"* ]] || [[ "$KV_CACHE_QUANT" == *"nvfp4"* ]]; then
+    cuda_major=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader -i 0 | cut -d. -f1)
+
+    if [ "$cuda_major" -lt 10 ]; then
+        echo "Please deploy the NVFP4 checkpoint on a Blackwell GPU. Checkpoint export_path: $SAVE_PATH"
+        exit 0
+    fi
+fi
+
 # Prepare datasets for TRT-LLM benchmark
 if [ -z "$TRT_LLM_CODE_PATH" ]; then
     TRT_LLM_CODE_PATH=/app/tensorrt_llm # default path for the TRT-LLM release docker image
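The NVFP4 gate above keys off the GPU's CUDA compute-capability major version (e.g. `9.0` on Hopper, `10.0` on Blackwell). A minimal, self-contained sketch of that parse, with `sample_cc` standing in for the real `nvidia-smi --query-gpu=compute_cap` output (an assumption for illustration, so it runs without a GPU):

```shell
#!/usr/bin/env bash
# Hedged sketch of the compute-capability check in the diff above.
# sample_cc is a stand-in for: nvidia-smi --query-gpu=compute_cap --format=csv,noheader -i 0
sample_cc="9.0"

# Keep everything before the first dot; equivalent to `cut -d. -f1`.
cuda_major="${sample_cc%%.*}"

if [ "$cuda_major" -lt 10 ]; then
    verdict="pre-blackwell"        # the real script prints a notice and exits 0
else
    verdict="blackwell-or-newer"   # safe to deploy the NVFP4 checkpoint
fi
echo "$verdict"
```

With `sample_cc="9.0"` this prints `pre-blackwell`, matching the script's behavior of bailing out before attempting an NVFP4 deployment on a pre-Blackwell GPU.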

tests/examples/vlm_ptq/test_qwen_vl.py

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@
 from _test_utils.torch_misc import minimum_gpu
 
 
-@pytest.mark.parametrize("quant", ["fp8"])
+@pytest.mark.parametrize("quant", ["fp8", "int8_sq", "nvfp4"])
 @minimum_gpu(2)
 def test_qwen_vl_multi_gpu(quant):
     run_vlm_ptq_command(model=QWEN_VL_PATH, quant=quant)
