
Compressed Tensors v0.12.0

Released by @dhuangnm on 01 Oct 16:30 · 12 commits to main since this release · 0d8c7c3

What's Changed

  • Refactor module / parameter matching logic by @fynnsu in #406
  • Revert "Refactor module / parameter matching logic (#406)" by @kylesayrs in #429
  • Refactor module / parameter matching logic by @fynnsu in #431
  • Add quality check to CI and fix existing errors by @fynnsu in #408
  • Speed up nvfp4 pack/unpack w/ torch.compile by @fynnsu in #400 (see the packing sketch after this list)
  • Simplify apply_quantization_config by @kylesayrs in #433
  • Install compilers etc to fix nightly test failure by @dhuangnm in #435
  • Fix a minor bug on GH hosted runners by @dhuangnm in #438
  • Remove references to llmcompressor.transformers.oneshot in examples by @brian-dellabetta in #422
  • [Tests] Combine quantization and dequantization tests by @kylesayrs in #443
  • Fix compress on meta device issue by @shanjiaz in #444
  • Throw error for unsupported activation strategies by @kylesayrs in #446
  • [Transform] Better dispatch support for offloaded and multi-gpu by @kylesayrs in #423
  • [Quantization Format] Add functionality to infer format by @dsikka in #441
  • Revert "[Quantization Format] Add functionality to infer format (#441)" by @dsikka in #451
  • Raise ValueError when nvfp4 pack tensor has odd number of columns by @fynnsu in #402 (also covered in the packing sketch below)
  • [Quantization] Allow dynamic group activation quantization by @kylesayrs in #450
  • Fix lint error on main by @fynnsu in #460
  • [Accelerate] Remove is_module_offloaded and update_prefix_dict by @kylesayrs in #366
  • [Decompression] Clean-up and some fixes by @dsikka in #461
  • [ModelCompressor] Remove missing keys and missing modules by @dsikka in #462
  • [Logging] Support use of loguru by @kylesayrs in #454 (see the loguru sketch after this list)
  • [Utils] Deprecate safe_permute by @kylesayrs in #464
  • [Quantization Format] Add functionality to infer format by @dsikka in #452
  • [Licensing Refactor] Remove frozendict dependency, use types.MappingProxyType instead by @brian-dellabetta in #469 (see the MappingProxyType sketch after this list)
  • [Transform] Support loading random hadamards on meta device by @kylesayrs in #445
  • [transforms] TransformScheme.block_size, deprecate head_dim by @brian-dellabetta in #466
  • [Multi-Modifier] Scoped apply quantization config by @brian-dellabetta in #432
  • [Model Compressor] Move infer call to from_pretrained_model method by @dsikka in #470
  • Always save g_idx when initialized in quantization compressor by @rahul-tuli in #467
  • Add back get unexpected keys to support transformers lower bound by @dsikka in #475
  • Improve Hugging Face API utilization in tests by @dbarbuzzi in #473
  • [Transform] Revert deprecation of TransformScheme.head_dim for compatibility with vllm by @brian-dellabetta in #472
  • [cicd] Include Python version in artifact name by @dbarbuzzi in #477
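
For readers curious about the nvfp4 packing changes (#400, #402): two 4-bit codes share one byte, so a tensor with an odd number of columns cannot be packed, and the elementwise pack/unpack arithmetic is a natural fit for torch.compile. The sketch below is purely illustrative; the function names are made up for this example and this is not the library's actual implementation.

```python
import torch

def pack_4bit(codes: torch.Tensor) -> torch.Tensor:
    """Pack pairs of 4-bit codes (uint8 values in [0, 15]) into single bytes."""
    if codes.shape[-1] % 2 != 0:
        # Mirrors the check added in #402: two codes share one byte,
        # so an odd column count cannot be packed.
        raise ValueError("expected an even number of columns to pack 4-bit values")
    low = codes[..., 0::2]
    high = codes[..., 1::2]
    return (high << 4) | low

def unpack_4bit(packed: torch.Tensor) -> torch.Tensor:
    """Inverse of pack_4bit: recover the interleaved 4-bit codes."""
    low = packed & 0x0F
    high = (packed >> 4) & 0x0F
    return torch.stack((low, high), dim=-1).flatten(-2)

# torch.compile can fuse the elementwise ops into fewer kernels,
# which is the kind of speedup #400 targets.
pack_4bit_compiled = torch.compile(pack_4bit)
```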
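
On #454: a minimal sketch of typical loguru configuration, for anyone unfamiliar with the library. The exact integration points inside compressed-tensors are not shown here.

```python
import sys
from loguru import logger

# Replace loguru's default sink with one at an explicit verbosity.
logger.remove()
logger.add(sys.stderr, level="INFO")

# Brace-style formatting is built into loguru.
logger.info("compressing model with format={}", "nvfp4")
logger.debug("filtered out at INFO level")
```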
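
On #469: types.MappingProxyType is a standard-library, read-only view over a dict and is a common replacement for frozendict when immutability (rather than hashability) is the goal. A minimal sketch of the pattern, with illustrative names that are not taken from the library:

```python
from types import MappingProxyType

# A read-only view over a private dict; writes raise TypeError.
_defaults = {"format": "dense", "num_bits": 8}
DEFAULTS = MappingProxyType(_defaults)

print(DEFAULTS["num_bits"])   # reads behave like a normal dict -> 8
try:
    DEFAULTS["num_bits"] = 4  # writes are rejected
except TypeError as err:
    print(err)                # 'mappingproxy' object does not support item assignment
```

Note that, unlike frozendict, a mappingproxy is not hashable, so it suits module-level constants rather than dictionary keys.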

New Contributors

Full Changelog: 0.11.0...0.12.0