Compressed Tensors v0.12.0
What's Changed
- Refactor module / parameter matching logic by @fynnsu in #406
- Revert "Refactor module / parameter matching logic (#406)" by @kylesayrs in #429
- Refactor module / parameter matching logic by @fynnsu in #431
- Add quality check to CI and fix existing errors by @fynnsu in #408
- Speed up nvfp4 pack/unpack w/ torch.compile by @fynnsu in #400
- Simplify `apply_quantization_config` by @kylesayrs in #433
- Install compilers etc to fix nightly test failure by @dhuangnm in #435
- Fix a minor bug on GH hosted runners by @dhuangnm in #438
- remove references to `llmcompressor.transformers.oneshot` in examples by @brian-dellabetta in #422
- [Tests] Combine quantization and dequantization tests by @kylesayrs in #443
- fix compress on meta device issue by @shanjiaz in #444
- Throw error for unsupported activation strategies by @kylesayrs in #446
- [Transform] Better dispatch support for offloaded and multi-gpu by @kylesayrs in #423
- [Quantization Format] Add functionality to infer format by @dsikka in #441
- Revert "[Quantization Format] Add functionality to infer format (#441)" by @dsikka in #451
- Raise ValueError when nvfp4 pack tensor has odd number of columns by @fynnsu in #402
- [Quantization] Allow dynamic group activation quantization by @kylesayrs in #450
- Fix lint error on main by @fynnsu in #460
- [Accelerate] Remove `is_module_offloaded` and `update_prefix_dict` by @kylesayrs in #366
- [Decompression] Clean-up and some fixes by @dsikka in #461
- [ModelCompressor] Remove missing keys and missing modules by @dsikka in #462
- [Logging] Support use of loguru by @kylesayrs in #454
- [Utils] Deprecate `safe_permute` by @kylesayrs in #464
- [Quantization Format] Add functionality to infer format by @dsikka in #452
- [licensing refactor] remove `frozendict` dependency, use `types.MappingProxyType` instead by @brian-dellabetta in #469
- [Transform] Support loading random hadamards on meta device by @kylesayrs in #445
- [transforms] `TransformScheme.block_size`, deprecate `head_dim` by @brian-dellabetta in #466
- [Multi-Modifier] Scoped apply quantization config by @brian-dellabetta in #432
- [Model Compressor] Move infer call to from_pretrained_model method by @dsikka in #470
- Always save g_idx when initialized in quantization compressor by @rahul-tuli in #467
- Add back get unexpected keys to support transformers lower bound by @dsikka in #475
- Improve Hugging Face API utilization in tests by @dbarbuzzi in #473
- [Transform] Revert deprecation of `TransformScheme.head_dim` for compatibility with vllm by @brian-dellabetta in #472
- [cicd] Include Python version in artifact name by @dbarbuzzi in #477
New Contributors
Full Changelog: 0.11.0...0.12.0