Releases: neuralmagic/compressed-tensors

Compressed Tensors v0.12.2

07 Oct 14:49
2dd1b62

Full Changelog: 0.12.1...0.12.2

Compressed Tensors v0.12.1

02 Oct 18:53
8206c93

What's Changed

  • [Patch Fix] Add get_missing_module_keys to support transformers lower bound by @dsikka in #479
  • [cicd] Add post-release run to push next nightly by @dbarbuzzi in #478

Full Changelog: 0.12.0...0.12.1

Compressed Tensors v0.12.0

01 Oct 16:30
0d8c7c3

What's Changed

  • Refactor module / parameter matching logic by @fynnsu in #406
  • Revert "Refactor module / parameter matching logic (#406)" by @kylesayrs in #429
  • Refactor module / parameter matching logic by @fynnsu in #431
  • Add quality check to CI and fix existing errors by @fynnsu in #408
  • Speed up nvfp4 pack/unpack w/ torch.compile by @fynnsu in #400
  • Simplify apply_quantization_config by @kylesayrs in #433
  • Install compilers etc to fix nightly test failure by @dhuangnm in #435
  • Fix a minor bug on GH hosted runners by @dhuangnm in #438
  • remove references to llmcompressor.transformers.oneshot in examples by @brian-dellabetta in #422
  • [Tests] Combine quantization and dequantization tests by @kylesayrs in #443
  • fix compress on meta device issue by @shanjiaz in #444
  • Throw error for unsupported activation strategies by @kylesayrs in #446
  • [Transform] Better dispatch support for offloaded and multi-gpu by @kylesayrs in #423
  • [Quantization Format] Add functionality to infer format by @dsikka in #441
  • Revert "[Quantization Format] Add functionality to infer format (#441)" by @dsikka in #451
  • Raise ValueError when nvfp4 pack tensor has odd number of columns by @fynnsu in #402
  • [Quantization] Allow dynamic group activation quantization by @kylesayrs in #450
  • Fix lint error on main by @fynnsu in #460
  • [Accelerate] Remove is_module_offloaded and update_prefix_dict by @kylesayrs in #366
  • [Decompression] Clean-up and some fixes by @dsikka in #461
  • [ModelCompressor] Remove missing keys and missing modules by @dsikka in #462
  • [Logging] Support use of loguru by @kylesayrs in #454
  • [Utils] Deprecate safe_permute by @kylesayrs in #464
  • [Quantization Format] Add functionality to infer format by @dsikka in #452
  • [licensing refactor] remove frozendict dependency, use types.MappingProxyType instead by @brian-dellabetta in #469
  • [Transform] Support loading random hadamards on meta device by @kylesayrs in #445
  • [transforms] TransformScheme.block_size, deprecate head_dim by @brian-dellabetta in #466
  • [Multi-Modifier] Scoped apply quantization config by @brian-dellabetta in #432
  • [Model Compressor] Move infer call to from_pretrained_model method by @dsikka in #470
  • Always save g_idx when initialized in quantization compressor by @rahul-tuli in #467
  • Add back get unexpected keys to support transformers lower bound by @dsikka in #475
  • Improve Hugging Face API utilization in tests by @dbarbuzzi in #473
  • [Transform] Revert deprecation of TransformScheme.head_dim for compatibility with vllm by @brian-dellabetta in #472
  • [cicd] Include Python version in artifact name by @dbarbuzzi in #477
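
Several NVFP4 entries above (#400's torch.compile pack/unpack speedup and #402's ValueError for odd column counts) concern the 4-bit packing step. As a rough illustration of why an odd number of columns cannot be packed, here is a minimal generic sketch; it is not the compressed-tensors implementation, which maps float values to an FP4 codebook and operates on tensors rather than byte strings:

```python
def pack_4bit(row):
    """Pack a row of 4-bit codes (ints 0..15) into bytes, two codes per byte."""
    if len(row) % 2 != 0:
        # Two 4-bit codes share one byte, so the column count must be even.
        raise ValueError(f"cannot pack {len(row)} columns: count must be even")
    return bytes((row[i] << 4) | row[i + 1] for i in range(0, len(row), 2))

def unpack_4bit(packed):
    """Invert pack_4bit: each byte yields two 4-bit codes."""
    out = []
    for b in packed:
        out.append(b >> 4)   # high nibble
        out.append(b & 0x0F) # low nibble
    return out
```

The even-column constraint falls directly out of the two-codes-per-byte layout, which is what #402 surfaces as an explicit ValueError.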

Full Changelog: 0.11.0...0.12.0

Compressed Tensors v0.11.0

19 Aug 19:02
b78eb8f

What's Changed

  • Fix nightly issues with python 3.11 by @dhuangnm in #371
  • Clean up disk space and restore ubuntu 22.04 runner by @dhuangnm in #373
  • [Transform] Update tests to use conftest file by @kylesayrs in #367
  • [Transform] Hadamard Permutations by @kylesayrs in #329
  • [Transform] Construct on GPU, cache on CPU by @kylesayrs in #352
  • enable code coverage collection and reporting INFERENG-1049 by @derekk-nm in #382
  • Deprecate iter_named_leaf_modules and iter_named_quantizable_modules by @kylesayrs in #381
  • Added support for compression on meta device by @shanjiaz in #376
  • Add torch.float64 as a viable dtype for scales by @eldarkurtic in #379
  • [Transform] apply_transform_config by @kylesayrs in #348
  • [Compression] Fix compression device movement in cases of indexed devices by @kylesayrs in #384
  • Enable code coverage report for nightly tests by @dhuangnm in #388
  • [Bugfix] Only quant-compress modules with weight quantization by @kylesayrs in #387
  • [Transform] Fix config serialization by @kylesayrs in #396
  • [Transform] Do not fuse div operation into hadamard matrices by @kylesayrs in #395
  • [Transform] Implement multi-headed transforms by @kylesayrs in #383
  • Support DeepSeekV3-style block FP8 quantization by @mgoin in #372
  • [Transform] [Utils] Canonical matching utilities by @kylesayrs in #392
  • [Bugfix] Safeguard against submodule parameter deletion in decompress_model by @kylesayrs in #347
  • fix block quantization initialization by @shanjiaz in #403
  • [Utils] Skip internal modules when matching by @kylesayrs in #404
  • [Quantization][Decompression] Fix QDQ for dynamic quant; Update NVFP4 Compression Params by @dsikka in #407
  • [Utils] Support matching vLLM modules by @kylesayrs in #413
  • Fix block size inference logic by @shanjiaz in #411
  • [Transform] Serialize with tied weights by @kylesayrs in #370
  • [Transform] [Utils] Support precision, add torch dtype validation by @kylesayrs in #414
  • [Transform] Serialize transforms config by @kylesayrs in #412
  • Error when configs are created with unrecognized fields by @kylesayrs in #386
  • revert forbid constraint on QuantizationConfig by @brian-dellabetta in #418
  • Revert "[Transform] Serialize transforms config (#412)" by @dsikka in #419
  • added wrapper for execution device by @shanjiaz in #417
  • [Transform] Serialize config (include format) by @dsikka in #420
  • exclude transform_config from quantization_config parse by @brian-dellabetta in #421
  • [Quantization] Support more than one quant-compressor by @dsikka in #415
  • [QuantizationScheme] Validate format by @dsikka in #424
  • [Utils] Expand is_match by @kylesayrs in #416
  • fix match.py syntax by @shanjiaz in #426
  • [Offload] Fully remove dispatch by @kylesayrs in #427
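
One entry above adds DeepSeekV3-style block FP8 quantization (#372), in which the weight matrix is tiled into fixed-size blocks and each block carries its own scale. The sketch below is a generic illustration of per-block scale computation only, assuming symmetric absmax scaling and the FP8 E4M3 maximum of 448.0; it is not the compressed-tensors API:

```python
FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def block_scales(weight, block_size=2):
    """Compute one scale per (block_size x block_size) tile of a 2D matrix."""
    rows, cols = len(weight), len(weight[0])
    scales = []
    for r0 in range(0, rows, block_size):
        row_scales = []
        for c0 in range(0, cols, block_size):
            # Absmax over the tile, clipped to the tile's actual extent.
            block_max = max(
                abs(weight[r][c])
                for r in range(r0, min(r0 + block_size, rows))
                for c in range(c0, min(c0 + block_size, cols))
            )
            row_scales.append(block_max / FP8_E4M3_MAX)
        scales.append(row_scales)
    return scales
```

A weight of shape (rows, cols) thus yields a scale grid of shape (ceil(rows / block_size), ceil(cols / block_size)), which is the shape-inference question that #411 and #403 above address for the real implementation.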

Full Changelog: 0.10.2...0.11.0

Compressed Tensors v0.10.2

23 Jun 13:20
38cbdd1

What's Changed

  • [Hotfix] Implement quantization compressor methods on dense compressor by @kylesayrs in #344
  • [Hotfix] Implement method on dense compressor by @kylesayrs in #345
  • [Transform] Factory classes with shared memory and offloading by @kylesayrs in #316
  • [Transform] [Bugfix] Fix enum value serialization in python>=3.11 by @kylesayrs in #350
  • Remove redundant call by @eldarkurtic in #349
  • [Accelerate] Rename and simplify force_cpu_offload by @kylesayrs in #354
  • [Transform] Extend set of known Hadamard matrices by @kylesayrs in #351
  • [Accelerate] Fix offloaded_dispatch, implement disable_offloading by @kylesayrs in #355
  • [Accelerate] Extend functionality of register_offload_parameter by @kylesayrs in #356
  • [Bugfix] Fix saving of models dispatched by offloaded_dispatch by @kylesayrs in #357
  • [Bugfix] Only update direct params in disable_offloading by @kylesayrs in #360
  • reference updated reportportal_submit_execution_results action by @derekk-nm in #362
  • [Accelerate] Expand get_execution_device to support models by @kylesayrs in #363
  • [Accelerate] Fix typos in get_execution_device by @kylesayrs in #365

Full Changelog: 0.10.1...0.10.2

Compressed Tensors v0.10.1

06 Jun 18:26
f5dbfc3

What's Changed

  • [Transform] Hadamard and Matrix Transform Utils by @kylesayrs in #330
  • Fix error on import whenever accelerate is absent by @maresb in #342
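
The Hadamard transform utilities added in #330 build on the standard power-of-two Hadamard matrices. As a minimal math illustration (not the library's code, which also supports a set of known non-power-of-two Hadamard sizes), the Sylvester construction doubles the matrix at each step:

```python
def hadamard(n):
    """Build an n x n Hadamard matrix via the Sylvester construction
    (n must be a power of two)."""
    if n < 1 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two")
    H = [[1]]
    while len(H) < n:
        # H_{2k} = [[H, H], [H, -H]]
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H
```

Rows of the result are mutually orthogonal with squared norm n, which is the property the rotation-based transforms in later releases rely on.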

Full Changelog: 0.10.0...0.10.1

Compressed Tensors v0.10.0

05 Jun 17:51
d7ce8ec

What's Changed

  • Updates to build system by @dbarbuzzi in #304
  • [Utils] add align_modules by @kylesayrs in #282
  • Enable module state_dict compression, simplify compression logic by @kylesayrs in #302
  • Fix _initialize_scale_zero_point initializing on the wrong device by @mgoin in #295
  • Revert "Enable module state_dict compression, simplify compression logic" by @kylesayrs in #306
  • [Bugfix] Fix shape calculation for group quantization by @kylesayrs in #308
  • Enable module state_dict compression, simplify compression logic by @kylesayrs in #307
  • Clarify decompression return type by @kylesayrs in #310
  • Clarify match_param_name return type by @kylesayrs in #312
  • [Compressor][NVFP4] Support FP4 Compression by @dsikka in #311
  • [NVFP4] Update FloatArgs and NVFP4 by @dsikka in #313
  • fix signatures on model_validator functions by @brian-dellabetta in #314
  • [Performance] Add memory compression and decompression pathways by @kylesayrs in #301
  • Model Compression: Set compression status by @kylesayrs in #318
  • [NVFP4] Enable Fp4 Quantization; introduce / apply global_scales by @dsikka in #315
  • [NVFP4] Skip fused global scale calculation if already fused by @dsikka in #322
  • Update default observer to be MSE by @shanjiaz in #300
  • [Misc] Generics typehinting for RegistryMixin by @kylesayrs in #320
  • Revert "Update default observer to be MSE (#300)" by @dsikka in #323
  • [NVFP4] Add tensor_group strategy; enable NVFP4 Activations by @dsikka in #317
  • [Transforms] Transform Args, Scheme, and Config by @kylesayrs in #321
  • [NVFP4] Expand dynamic types, clean-up conditions by @dsikka in #325
  • Use different runner for UPLOAD job by @dbarbuzzi in #327
  • [NVFP4] Use torch.compile when rounding to NVFP4 by @dsikka in #331
  • [Tests] Update test_fp8_quant.py by @dsikka in #337
  • [Tests] Fix test scale init for group quant by @dsikka in #338
  • [Quantization] Update group quantization by @dsikka in #336
  • [NVFP4] update global scale generation by @dsikka in #339
  • [Transform] Accelerate Utilities by @kylesayrs in #328
  • Model Compression: Delete offload by @kylesayrs in #319
  • [Decompression] Keep unused parameters when decompressing from memory by @kylesayrs in #340
  • [NVFP4] Small Nits by @dsikka in #341
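
Several entries above revolve around group quantization (#308's shape fix, #336's update): each row of a weight is split into contiguous groups of `group_size` columns, and every group gets its own scale. The following is a generic symmetric-INT8 sketch of the per-group scale shape only, with all names assumed for illustration; it is not the compressed-tensors implementation:

```python
def group_scales(row, group_size, qmax=127):
    """One symmetric absmax scale per group of `group_size` consecutive values."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be divisible by group_size")
    return [
        max(abs(v) for v in row[i:i + group_size]) / qmax
        for i in range(0, len(row), group_size)
    ]
```

A row of length C therefore produces C / group_size scales, the shape calculation that #308 above corrects in the real code.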

Full Changelog: 0.9.4...0.10.0

Compressed Tensors v0.9.4

24 Apr 19:21
8aa8b82

Full Changelog: 0.9.3...0.9.4

Compressed Tensors v0.9.3

02 Apr 17:13
4574747

Full Changelog: 0.9.2...0.9.3

Compressed Tensors v0.9.2

18 Feb 19:08
b8cf630

Full Changelog: 0.9.1...0.9.2