Compressed Tensors v0.11.0

@dhuangnm dhuangnm released this 19 Aug 19:02
b78eb8f

What's Changed

  • Fix nightly issues with Python 3.11 by @dhuangnm in #371
  • Clean up disk space and restore Ubuntu 22.04 runner by @dhuangnm in #373
  • [Transform] Update tests to use conftest file by @kylesayrs in #367
  • [Transform] Hadamard Permutations by @kylesayrs in #329
  • [Transform] Construct on GPU, cache on CPU by @kylesayrs in #352
  • Enable code coverage collection and reporting (INFERENG-1049) by @derekk-nm in #382
  • Deprecate iter_named_leaf_modules and iter_named_quantizable_modules by @kylesayrs in #381
  • Added support for compression on meta device by @shanjiaz in #376
  • Add torch.float64 as a viable dtype for scales by @eldarkurtic in #379
  • [Transform] apply_transform_config by @kylesayrs in #348
  • [Compression] Fix compression device movement in cases of indexed devices by @kylesayrs in #384
  • Enable code coverage report for nightly tests by @dhuangnm in #388
  • [Bugfix] Only quant-compress modules with weight quantization by @kylesayrs in #387
  • [Transform] Fix config serialization by @kylesayrs in #396
  • [Transform] Do not fuse div operation into Hadamard matrices by @kylesayrs in #395
  • [Transform] Implement multi-headed transforms by @kylesayrs in #383
  • Support DeepSeekV3-style block FP8 quantization by @mgoin in #372
  • [Transform] [Utils] Canonical matching utilities by @kylesayrs in #392
  • [Bugfix] Safeguard against submodule parameter deletion in decompress_model by @kylesayrs in #347
  • Fix block quantization initialization by @shanjiaz in #403
  • [Utils] Skip internal modules when matching by @kylesayrs in #404
  • [Quantization][Decompression] Fix QDQ for dynamic quant; Update NVFP4 Compression Params by @dsikka in #407
  • [Utils] Support matching vLLM modules by @kylesayrs in #413
  • Fix block size inference logic by @shanjiaz in #411
  • [Transform] Serialize with tied weights by @kylesayrs in #370
  • [Transform] [Utils] Support precision, add torch dtype validation by @kylesayrs in #414
  • [Transform] Serialize transforms config by @kylesayrs in #412
  • Error when configs are created with unrecognized fields by @kylesayrs in #386
  • Revert forbid constraint on QuantizationConfig by @brian-dellabetta in #418
  • Revert "[Transform] Serialize transforms config (#412)" by @dsikka in #419
  • Added wrapper for execution device by @shanjiaz in #417
  • [Transform] Serialize config (include format) by @dsikka in #420
  • Exclude transform_config from quantization_config parse by @brian-dellabetta in #421
  • [Quantization] Support more than one quant-compressor by @dsikka in #415
  • [QuantizationScheme] Validate format by @dsikka in #424
  • [Utils] Expand is_match by @kylesayrs in #416
  • Fix match.py syntax by @shanjiaz in #426
  • [Offload] Fully remove dispatch by @kylesayrs in #427

Full Changelog: 0.10.2...0.11.0