Background
Right now, quantization configs are serialized through the following lifecycle:
1. `apply_quantization_config` is used to attach `quantization_scheme` attributes to modules
2. The model undergoes calibration and compression
3. The quantization config is regenerated from the model using `QuantizationConfig.from_pretrained`
4. The new config is serialized by `ModelCompressor.update_config`
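The lifecycle above can be simulated with a minimal, self-contained sketch. The real compressed-tensors classes are replaced by stand-ins here; the attribute name `quantization_scheme` comes from this issue, but every signature and config shape below is an illustrative assumption, not the library's actual API:

```python
# Stand-in for a torch module; only the attribute behavior matters here.
class Module:
    def __init__(self, name):
        self.name = name
        self.quantization_scheme = None

def apply_quantization_config(modules, config):
    # Step 1 (sketch): attach per-module scheme attributes from the config.
    for module in modules:
        if module.name in config["ignore"]:
            continue
        for group_name, scheme in config["config_groups"].items():
            module.quantization_scheme = scheme

def config_from_model(modules):
    # Step 3 (sketch): regenerate a config by scanning module attributes.
    # User-chosen group names are gone at this point; groups are re-derived,
    # and the ignore list is rebuilt per-module.
    groups, ignore = {}, []
    for module in modules:
        if module.quantization_scheme is None:
            ignore.append(module.name)
        else:
            groups.setdefault("group_0", module.quantization_scheme)
    return {"config_groups": groups, "ignore": ignore}

modules = [Module("model.layers.0.self_attn"), Module("lm_head")]
user_config = {"config_groups": {"attn_w8a8": {"bits": 8}}, "ignore": ["lm_head"]}
apply_quantization_config(modules, user_config)
regenerated = config_from_model(modules)
print(regenerated)
# → {'config_groups': {'group_0': {'bits': 8}}, 'ignore': ['lm_head']}
```

Note that the regenerated config has lost the user's group name (`attn_w8a8` became `group_0`), which is exactly the first downside listed below.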
This approach has some downsides (see the phi3 example config):
- Any config group names set by the user are discarded
- The generated config groups do not necessarily match the config groups set by the user
- The ignore list becomes very large and hard to read
- The logic for generating a config from a model is difficult to maintain
The scope of this issue is to investigate an approach whereby step (1) attaches the config as a `quantization_config` attribute on the model, which is then read by step (4) without having to go through step (3). This would mitigate all of the above downsides.
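A minimal sketch of that proposed flow, assuming the `quantization_config` attribute name from this issue; the surrounding class and helper names are hypothetical:

```python
# Sketch: step (1) keeps the user's config verbatim on the model, and
# serialization (step 4) reads it back directly, skipping regeneration.
class Model:
    pass

def apply_quantization_config(model, config):
    # Attach schemes to submodules as before (omitted), and additionally
    # store the original config on the model. Hypothetical signature.
    model.quantization_config = config

def update_config(model, output_config):
    # Serialize the attached config directly, preserving the user's
    # group names and original ignore list. Hypothetical signature.
    output_config["quantization_config"] = model.quantization_config

model = Model()
user_config = {"config_groups": {"attn_w8a8": {"bits": 8}}, "ignore": ["lm_head"]}
apply_quantization_config(model, user_config)
out = {}
update_config(model, out)
print(out["quantization_config"])
# → {'config_groups': {'attn_w8a8': {'bits': 8}}, 'ignore': ['lm_head']}
```

Because the serialized config is the user's config, group names and the ignore list round-trip unchanged.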
Note: some things to keep in mind. `apply_quantization_config` may be applied multiple times, which may necessitate some logic to "merge" quantization configs. A draft of this has already been written; feel free to ping @kylesayrs if you would like to leverage it, or use your own.
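Since `apply_quantization_config` may run more than once, the attached configs would need merging. A draft already exists per the note above; the strategy below is only one hypothetical illustration (later groups win on name collisions, ignore lists are unioned):

```python
def merge_quantization_configs(base, new):
    """Hypothetical merge of two quantization configs (dicts here for
    illustration; the real objects are pydantic models in the library)."""
    # Later config groups override earlier ones on name collisions.
    merged_groups = {**base["config_groups"], **new["config_groups"]}
    # Union the ignore lists while preserving first-seen order.
    merged_ignore = list(dict.fromkeys(base["ignore"] + new["ignore"]))
    return {"config_groups": merged_groups, "ignore": merged_ignore}

first = {"config_groups": {"attn_w8a8": {"bits": 8}}, "ignore": ["lm_head"]}
second = {"config_groups": {"mlp_w4a16": {"bits": 4}}, "ignore": ["lm_head", "embed"]}
merged = merge_quantization_configs(first, second)
print(merged)
# → {'config_groups': {'attn_w8a8': {'bits': 8}, 'mlp_w4a16': {'bits': 4}},
#    'ignore': ['lm_head', 'embed']}
```

Whether collisions should override, error, or require schemes to be identical is a design decision the actual implementation would need to settle.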