Compressed Tensors v0.11.0
What's Changed
- Fix nightly issues with python 3.11 by @dhuangnm in #371
- Clean up disk space and restore ubuntu 22.04 runner by @dhuangnm in #373
- [Transform] Update tests to use conftest file by @kylesayrs in #367
- [Transform] Hadamard Permutations by @kylesayrs in #329
- [Transform] Construct on GPU, cache on CPU by @kylesayrs in #352
- enable code coverage collection and reporting INFERENG-1049 by @derekk-nm in #382
- Deprecate iter_named_leaf_modules and iter_named_quantizable_modules by @kylesayrs in #381
- Added support for compression on meta device by @shanjiaz in #376
- Add torch.float64 as a viable dtype for scales by @eldarkurtic in #379
- [Transform] apply_transform_config by @kylesayrs in #348
- [Compression] Fix compression device movement in cases of indexed devices by @kylesayrs in #384
- Enable code coverage report for nightly tests by @dhuangnm in #388
- [Bugfix] Only quant-compress modules with weight quantization by @kylesayrs in #387
- [Transform] Fix config serialization by @kylesayrs in #396
- [Transform] Do not fuse div operation into hadamard matrices by @kylesayrs in #395
- [Transform] Implement multi-headed transforms by @kylesayrs in #383
- Support DeepSeekV3-style block FP8 quantization by @mgoin in #372
- [Transform] [Utils] Canonical matching utilities by @kylesayrs in #392
- [Bugfix] Safeguard against submodule parameter deletion in decompress_model by @kylesayrs in #347
- fix block quantization initialization by @shanjiaz in #403
- [Utils] Skip internal modules when matching by @kylesayrs in #404
- [Quantization][Decompression] Fix QDQ for dynamic quant; Update NVFP4 Compression Params by @dsikka in #407
- [Utils] Support matching vLLM modules by @kylesayrs in #413
- Fix block size inference logic by @shanjiaz in #411
- [Transform] Serialize with tied weights by @kylesayrs in #370
- [Transform] [Utils] Support precision, add torch dtype validation by @kylesayrs in #414
- [Transform] Serialize transforms config by @kylesayrs in #412
- Error when configs are created with unrecognized fields by @kylesayrs in #386
- revert forbid constraint on QuantizationConfig by @brian-dellabetta in #418
- Revert "[Transform] Serialize transforms config (#412)" by @dsikka in #419
- added wrapper for execution device by @shanjiaz in #417
- [Transform] Serialize config (include format) by @dsikka in #420
- exclude transform_config from quantization_config parse by @brian-dellabetta in #421
- [Quantization] Support more than one quant-compressor by @dsikka in #415
- [QuantizationScheme] Validate format by @dsikka in #424
- [Utils] Expand is_match by @kylesayrs in #416
- fix match.py syntax by @shanjiaz in #426
- [Offload] Fully remove dispatch by @kylesayrs in #427
Full Changelog: 0.10.2...0.11.0