Releases: neuralmagic/compressed-tensors
Compressed Tensors v0.12.2
What's Changed
- remove deprecated safe_permute by @brian-dellabetta in #471
- [Transform] Fix accelerate import to keep it as optional dependency by @tboerstad in #480
- [Bugfix] Fix Per-Token Dynamic Activation Quantization by @max410011 in #393
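The per-token dynamic activation quantization fixed in #393 refers to the general technique of computing one quantization scale per token row at runtime. A minimal numpy sketch of that technique (illustrative only, not the compressed-tensors implementation; function and variable names are hypothetical):

```python
import numpy as np

def per_token_dynamic_quant(x, num_bits=8):
    """Quantize each row (token) of x with its own dynamically computed scale.

    Generic sketch of per-token dynamic int8 quantization; names are
    illustrative and do not mirror the compressed-tensors API.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    # One scale per token, derived at runtime from that token's max magnitude
    scales = np.abs(x).max(axis=-1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # guard against all-zero rows
    q = np.clip(np.round(x / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales

# Round trip: dequantized values stay within half a scale step of the originals
x = np.array([[0.5, -1.0, 2.0], [10.0, -20.0, 5.0]])
q, scales = per_token_dynamic_quant(x)
x_hat = q.astype(np.float32) * scales
```

Because each token gets its own scale, an outlier in one row does not degrade the precision of the others, which is the main motivation for the per-token strategy.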
New Contributors
- @tboerstad made their first contribution in #480
- @max410011 made their first contribution in #393
Full Changelog: 0.12.1...0.12.2
Compressed Tensors v0.12.1
What's Changed
- [Patch Fix] Add `get_missing_module_keys` to support transformers lower bound by @dsikka in #479
- [cicd] Add post-release run to push next nightly by @dbarbuzzi in #478
Full Changelog: 0.12.0...0.12.1
Compressed Tensors v0.12.0
What's Changed
- Refactor module / parameter matching logic by @fynnsu in #406
- Revert "Refactor module / parameter matching logic (#406)" by @kylesayrs in #429
- Refactor module / parameter matching logic by @fynnsu in #431
- Add quality check to CI and fix existing errors by @fynnsu in #408
- Speed up nvfp4 pack/unpack w/ torch.compile by @fynnsu in #400
- Simplify `apply_quantization_config` by @kylesayrs in #433
- Install compilers etc to fix nightly test failure by @dhuangnm in #435
- Fix a minor bug on GH hosted runners by @dhuangnm in #438
- remove references to `llmcompressor.transformers.oneshot` in examples by @brian-dellabetta in #422
- [Tests] Combine quantization and dequantization tests by @kylesayrs in #443
- fix compress on meta device issue by @shanjiaz in #444
- Throw error for unsupported activation strategies by @kylesayrs in #446
- [Transform] Better dispatch support for offloaded and multi-gpu by @kylesayrs in #423
- [Quantization Format] Add functionality to infer format by @dsikka in #441
- Revert "[Quantization Format] Add functionality to infer format (#441)" by @dsikka in #451
- Raise ValueError when nvfp4 pack tensor has odd number of columns by @fynnsu in #402
- [Quantization] Allow dynamic group activation quantization by @kylesayrs in #450
- Fix lint error on main by @fynnsu in #460
- [Accelerate] Remove `is_module_offloaded` and `update_prefix_dict` by @kylesayrs in #366
- [Decompression] Clean-up and some fixes by @dsikka in #461
- [ModelCompressor] Remove missing keys and missing modules by @dsikka in #462
- [Logging] Support use of loguru by @kylesayrs in #454
- [Utils] Deprecate `safe_permute` by @kylesayrs in #464
- [Quantization Format] Add functionality to infer format by @dsikka in #452
- [licensing refactor] remove `frozendict` dependency, use `types.MappingProxyType` instead by @brian-dellabetta in #469
- [Transform] Support loading random hadamards on meta device by @kylesayrs in #445
- [transforms] TransformScheme.block_size, deprecate head_dim by @brian-dellabetta in #466
- [Multi-Modifier] Scoped apply quantization config by @brian-dellabetta in #432
- [Model Compressor] Move infer call to from_pretrained_model method by @dsikka in #470
- Always save g_idx when initialized in quantization compressor by @rahul-tuli in #467
- Add back get unexpected keys to support transformers lower bound by @dsikka in #475
- Improve Hugging Face API utilization in tests by @dbarbuzzi in #473
- [Transform] Revert deprecation of `TransformScheme.head_dim` for compatibility with vllm by @brian-dellabetta in #472
- [cicd] Include Python version in artifact name by @dbarbuzzi in #477
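The licensing refactor in #469 swaps `frozendict` for `types.MappingProxyType`, a read-only dict view that ships with the Python standard library. A minimal sketch of the pattern (the dict contents here are invented for illustration):

```python
from types import MappingProxyType

# Private backing dict; the proxy below is the only object handed out
_defaults = {"num_bits": 8, "symmetric": True}

# A read-only view: lookups work, but any write raises TypeError
DEFAULTS = MappingProxyType(_defaults)

assert DEFAULTS["num_bits"] == 8
try:
    DEFAULTS["num_bits"] = 4
except TypeError:
    pass  # mutation through the proxy is rejected
```

Unlike `frozendict`, a `MappingProxyType` is only a view: mutations to the backing dict still show through it, so the backing dict should be kept private.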
Full Changelog: 0.11.0...0.12.0
Compressed Tensors v0.11.0
What's Changed
- Fix nightly issues with python 3.11 by @dhuangnm in #371
- Clean up disk space and restore ubuntu 22.04 runner by @dhuangnm in #373
- [Transform] Update tests to use conftest file by @kylesayrs in #367
- [Transform] Hadamard Permutations by @kylesayrs in #329
- [Transform] Construct on GPU, cache on CPU by @kylesayrs in #352
- enable code coverage collection and reporting INFERENG-1049 by @derekk-nm in #382
- Deprecate `iter_named_leaf_modules` and `iter_named_quantizable_modules` by @kylesayrs in #381
- Added support for compression on meta device by @shanjiaz in #376
- Add torch.float64 as a viable dtype for scales by @eldarkurtic in #379
- [Transform] `apply_transform_config` by @kylesayrs in #348
- [Compression] Fix compression device movement in cases of indexed devices by @kylesayrs in #384
- Enable code coverage report for nightly tests by @dhuangnm in #388
- [Bugfix] Only quant-compress modules with weight quantization by @kylesayrs in #387
- [Transform] Fix config serialization by @kylesayrs in #396
- [Transform] Do not fuse div operation into hadamard matrices by @kylesayrs in #395
- [Transform] Implement multi-headed transforms by @kylesayrs in #383
- Support DeepSeekV3-style block FP8 quantization by @mgoin in #372
- [Transform] [Utils] Canonical matching utilities by @kylesayrs in #392
- [Bugfix] Safeguard against submodule parameter deletion in decompress_model by @kylesayrs in #347
- fix block quantization initialization by @shanjiaz in #403
- [Utils] Skip internal modules when matching by @kylesayrs in #404
- [Quantization][Decompression] Fix QDQ for dynamic quant; Update NVFP4 Compression Params by @dsikka in #407
- [Utils] Support matching vLLM modules by @kylesayrs in #413
- Fix block size inference logic by @shanjiaz in #411
- [Transform] Serialize with tied weights by @kylesayrs in #370
- [Transform] [Utils] Support precision, add torch dtype validation by @kylesayrs in #414
- [Transform] Serialize transforms config by @kylesayrs in #412
- Error when configs are created with unrecognized fields by @kylesayrs in #386
- revert forbid constraint on QuantizationConfig by @brian-dellabetta in #418
- Revert "[Transform] Serialize transforms config (#412)" by @dsikka in #419
- added wrapper for execution device by @shanjiaz in #417
- [Transform] Serialize config (include format) by @dsikka in #420
- exclude transform_config from quantization_config parse by @brian-dellabetta in #421
- [Quantization] Support more than one quant-compressor by @dsikka in #415
- [QuantizationScheme] Validate format by @dsikka in #424
- [Utils] Expand `is_match` by @kylesayrs in #416
- fix match.py syntax by @shanjiaz in #426
- [Offload] Fully remove dispatch by @kylesayrs in #427
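The DeepSeekV3-style block FP8 support added in #372 refers to block-wise quantization: the weight is split into fixed-size tiles, each with its own scale. A generic numpy sketch of the scaling step (illustrative only; a real FP8 path would also round values onto the e4m3 grid, which is omitted here):

```python
import numpy as np

def block_scale_fp8(w, block=2):
    """Scale a 2-D weight in (block x block) tiles, one scale per tile.

    Generic sketch of block-wise quantization scaling; not the
    compressed-tensors implementation, and rounding to FP8 is omitted.
    """
    rows, cols = w.shape
    assert rows % block == 0 and cols % block == 0
    qmax = 448.0  # max magnitude representable in float8 e4m3
    # View as (row_blocks, block, col_blocks, block) tiles
    tiles = w.reshape(rows // block, block, cols // block, block)
    scales = np.abs(tiles).max(axis=(1, 3), keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # guard all-zero tiles
    q = tiles / scales  # every value now fits in the FP8 range
    return q.reshape(rows, cols), scales.squeeze((1, 3))
```

Per-block scales bound the dynamic range each tile must cover, which is what lets low-precision formats like FP8 represent weights with large outliers.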
Full Changelog: 0.10.2...0.11.0
Compressed Tensors v0.10.2
What's Changed
- [Hotfix] Implement quantization compressor methods on dense compressor by @kylesayrs in #344
- [Hotfix] Implement method on dense compressor by @kylesayrs in #345
- [Transform] Factory classes with shared memory and offloading by @kylesayrs in #316
- [Transform] [Bugfix] Fix enum value serialization in python>=3.11 by @kylesayrs in #350
- Remove redundant call by @eldarkurtic in #349
- [Accelerate] Rename and simplify `force_cpu_offload` by @kylesayrs in #354
- [Transform] Extend set of known Hadamard matrices by @kylesayrs in #351
- [Accelerate] Fix `offloaded_dispatch`, implement `disable_offloading` by @kylesayrs in #355
- [Accelerate] Extend functionality of `register_offload_parameter` by @kylesayrs in #356
- [Bugfix] Fix saving of models dispatched by `offloaded_dispatch` by @kylesayrs in #357
- [Bugfix] Only update direct params in `disable_offloading` by @kylesayrs in #360
- reference updated reportportal_submit_execution_results action by @derekk-nm in #362
- [Accelerate] Expand `get_execution_device` to support models by @kylesayrs in #363
- [Accelerate] Fix typos in `get_execution_device` by @kylesayrs in #365
New Contributors
- @derekk-nm made their first contribution in #362
Full Changelog: 0.10.1...0.10.2
Compressed Tensors v0.10.1
What's Changed
- [Transform] Hadamard and Matrix Transform Utils by @kylesayrs in #330
- Fix error on import whenever accelerate is absent by @maresb in #342
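The Hadamard utilities landed in #330 work with Hadamard matrices, whose rows are mutually orthogonal with entries in {+1, -1}. A generic sketch of Sylvester's construction for power-of-two sizes (illustrative only, not the library's code):

```python
import numpy as np

def sylvester_hadamard(n):
    """Build an n x n Hadamard matrix for n a power of two.

    Sylvester's construction: H_1 = [1], H_2k = [[H, H], [H, -H]].
    Generic illustration; not the compressed-tensors implementation.
    """
    assert n >= 1 and (n & (n - 1)) == 0, "n must be a power of two"
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])  # double the size each step
    return H

H = sylvester_hadamard(4)
# Defining property: H @ H.T == n * I, so H / sqrt(n) is orthonormal
```

The orthogonality property is what makes Hadamard-based transforms cheap to invert: applying the scaled matrix twice recovers the input exactly.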
Full Changelog: 0.10.0...0.10.1
Compressed Tensors v0.10.0
What's Changed
- Updates to build system by @dbarbuzzi in #304
- [Utils] add align_modules by @kylesayrs in #282
- Enable module state_dict compression, simplify compression logic by @kylesayrs in #302
- Fix `_initialize_scale_zero_point` initializing on the wrong device by @mgoin in #295
- Revert "Enable module state_dict compression, simplify compression lo… by @kylesayrs in #306
- [Bugfix] Fix shape calculation for group quantization by @kylesayrs in #308
- Enable module state_dict compression, simplify compression logic by @kylesayrs in #307
- Clarify decompression return type by @kylesayrs in #310
- Clarify `match_param_name` return type by @kylesayrs in #312
- [Compressor][NVFP4] Support FP4 Compression by @dsikka in #311
- [NVFP4] Update FloatArgs and NVFP4 by @dsikka in #313
- fix signatures on model_validator functions by @brian-dellabetta in #314
- [Performance] Add memory compression and decompression pathways by @kylesayrs in #301
- Model Compression: Set compression status by @kylesayrs in #318
- [NVFP4] Enable Fp4 Quantization; introduce / apply global_scales by @dsikka in #315
- [NVFP4] Skip fused global scale calculation if already fused by @dsikka in #322
- Update default observer to be `MSE` by @shanjiaz in #300
- [Misc] Generics typehinting for `RegistryMixin` by @kylesayrs in #320
- Revert "Update default observer to be `MSE` (#300)" by @dsikka in #323
- [NVFP4] Add `tensor_group` strategy; enable NVFP4 Activations by @dsikka in #317
- [Transforms] Transform Args, Scheme, and Config by @kylesayrs in #321
- [NVFP4] Expand dynamic types, clean-up conditions by @dsikka in #325
- Use different runner for UPLOAD job by @dbarbuzzi in #327
- [NVFP4] Use torch.compile when rounding to NVFP4 by @dsikka in #331
- [Tests] Update test_fp8_quant.py by @dsikka in #337
- [Tests] Fix test scale init for group quant by @dsikka in #338
- [Quantization] Update group quantization by @dsikka in #336
- [NVFP4] update global scale generation by @dsikka in #339
- [Transform] Accelerate Utilities by @kylesayrs in #328
- Model Compression: Delete offload by @kylesayrs in #319
- [Decompression] Keep unused parameters when decompressing from memory by @kylesayrs in #340
- [NVFP4] Small Nits by @dsikka in #341
Full Changelog: 0.9.4...0.10.0
Compressed Tensors v0.9.4
What's Changed
- Remove compression_ratio calculation by @dsikka in #293
- Build with setuptools scm by @dhuangnm in #292
- fix a few minor issues by @dhuangnm in #294
- Some fixes for AWQ by @rahul-tuli in #269
- Fix upload issue when package already existed on PyPI by @dhuangnm in #297
- Update action tags by @dhuangnm in #298
- Pick up fix from nm-actions by @dhuangnm in #299
- [Compressor] Update packed compressor to support zp packing by @dsikka in #296
- [Decompression] Update Decompression Lifecycle by @dsikka in #285
- [Accelerate] allow get_execution_device to be used when initializing a model by @kylesayrs in #303
Full Changelog: 0.9.3...0.9.4
Compressed Tensors v0.9.3
What's Changed
- remove testmo by @dhuangnm in #258
- update tag for summary-test action by @dhuangnm in #259
- [Bugfix] Support offloaded parameters when initializing KV cache parameters by @kylesayrs in #261
- Update: CompressedLinear to decompress once by @rahul-tuli in #266
- [BugFix]: `AttributeError` in `CompressedLinear` by @rahul-tuli in #273
- Fix case when using weight_packed, not weight by @dsikka in #278
- Report test results to Report Portal by @dhuangnm in #271
- use fine-grained token for workflow by @dhuangnm in #283
- Rectify Asym Compression/Decompression Pathways by @dsikka in #225
- Bump CT Version by @dsikka in #288
Full Changelog: 0.9.2...0.9.3
Compressed Tensors v0.9.2
What's Changed
- ModelCompressor type checking import by @kylesayrs in #220
- Fix warning for dynamic quantization args by @kylesayrs in #227
- Deprecate `get_observer` by @kylesayrs in #214
- Accelerate Utilities: Throw warning when updating with different shapes by @kylesayrs in #231
- Use faster operations on packed-quantized, add tests by @horheynm in #211
- Update build workflow to Python 3.12 by @dbarbuzzi in #248
- Replace `COMPRESSION_PARAM_NAMES` with Abstract Property by @rahul-tuli in #249
- Kylesayrs/update readme by @brian-dellabetta in #252
- Add: missing and unexpected keys in ModelCompressor by @rahul-tuli in #250
- switch runners by @dhuangnm in #254
- Bump version for patch release by @dsikka in #255
Full Changelog: 0.9.1...0.9.2