Commit b866cdb

[Misc] Add assertion and helpful message for marlin24 compressed models (#11388)
1 parent 2e72668 commit b866cdb

File tree

1 file changed: +4 −0 lines

vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w4a16_24.py

Lines changed: 4 additions & 0 deletions

@@ -61,6 +61,10 @@ def create_weights(self, layer: torch.nn.Module, input_size: int,
                        params_dtype: torch.dtype, weight_loader: Callable,
                        **kwargs):
+        assert params_dtype == torch.float16, (
+            "float16 is required for marlin24 compressed models. Set dtype=torch.float16"  # noqa: E501
+        )
         pack_factor = 32 // self.quant_type.size_bits
         output_size_per_partition = sum(output_partition_sizes)
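For context, the guard this commit adds can be sketched as a standalone check. This is a hedged illustration, not vLLM's actual code path: `check_marlin24_dtype` is a hypothetical name, and the dtype is modeled as a plain string rather than a `torch.dtype` so the sketch runs without PyTorch installed.

```python
def check_marlin24_dtype(params_dtype: str) -> None:
    """Hypothetical standalone version of the assertion this commit adds.

    The marlin24 sparse-quantized kernels only support half precision,
    so any other dtype is rejected up front with an actionable message
    instead of failing later inside the kernel.
    """
    # In vLLM the comparison is against torch.float16; a string stands
    # in for torch.dtype here to keep the sketch dependency-free.
    assert params_dtype == "float16", (
        "float16 is required for marlin24 compressed models. "
        "Set dtype=torch.float16")


check_marlin24_dtype("float16")       # supported dtype: passes silently

try:
    check_marlin24_dtype("bfloat16")  # any other dtype raises
except AssertionError as exc:
    print(f"rejected: {exc}")
```

The point of asserting in `create_weights` is that the dtype mismatch surfaces at model-load time with a message telling the user exactly which setting to change.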