Commit bfb5167
Fix: Correct device placement for QuantizedLinear in AWQ
Addresses an AttributeError in AWQ quantization where QuantizedLinear,
an nn.Module, was incorrectly passed to move_to_device, which expects
a tensor. This change ensures QuantizedLinear modules are moved to the
target device using the correct .to(device) method.
Additionally, this commit includes updates to the documentation:
- Docs for AWQ quantization were updated to include parameters like scale_dtype, enable_mnn_kernel, and batch_size.
- Clarified inference procedures for AWQ-quantized models.
- README.md was updated to list AWQ as a supported method and the roadmap was revised.

1 parent 8e517be
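The three documented parameters can be illustrated with a minimal config sketch. The names `scale_dtype`, `enable_mnn_kernel`, and `batch_size` come from the commit message; everything else below (the dict shape, the extra `bits` key, and all values) is an assumption for illustration only:

```python
# Hypothetical AWQ configuration fragment; only the three commented
# "documented:" keys are attested by the commit message.
awq_config = {
    "bits": 4,                   # assumed typical AWQ bit width
    "scale_dtype": "float16",    # documented: dtype used for quantization scales
    "enable_mnn_kernel": False,  # documented: toggles an MNN kernel path
    "batch_size": 8,             # documented: batch size (likely for calibration)
}
```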
File tree: 3 files changed (+14, −6 lines)
- docs/api_reference
- quantllm/quant
[Diff 1 — code content not recoverable from the page extraction. Two hunks: line 24 replaced (−1/+1) and line 79 replaced (−1/+1).]
[Diff 2 — code content not recoverable. Two hunks: 3 lines added after line 175 (new lines 176–178), and around original lines 260–269: 2 lines added after line 262 (new lines 266–267), then lines 264–266 replaced by 6 new lines (269–274).]
[Diff 3 — code content not recoverable. One hunk: line 195 replaced (−1/+1).]