
[Config] Investigate config refactor for cleaner serialization, reduced logic #494

@kylesayrs

Description

Background

Right now, quantization configs are serialized through the following lifecycle:

  1. apply_quantization_config is used to attach quantization_scheme attributes to modules
  2. The model undergoes calibration and compression
  3. The quantization config is regenerated from the model using QuantizationConfig.from_pretrained
  4. The new config is serialized by ModelCompressor.update_config
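
The lifecycle above can be sketched with simplified stand-ins; the class shapes, signatures, and the group-naming heuristic below are assumptions for illustration, not the actual compressed-tensors API:

```python
# Illustrative sketch of the current lifecycle. All names and signatures
# here are simplified stand-ins, not the real compressed-tensors API.
from dataclasses import dataclass, field

@dataclass
class QuantizationScheme:
    targets: list
    num_bits: int = 8

@dataclass
class QuantizationConfig:
    config_groups: dict
    ignore: list = field(default_factory=list)

class Module:
    def __init__(self, name):
        self.name = name

def apply_quantization_config(model, config):
    # Step 1: attach a quantization_scheme attribute to each targeted module
    for module in model:
        for scheme in config.config_groups.values():
            if module.name in scheme.targets:
                module.quantization_scheme = scheme

def config_from_model(model):
    # Step 3: regenerate a config by scanning module attributes. User-chosen
    # group names are lost here, and the ignore list is rebuilt from every
    # module that lacks a scheme, which is why it grows so large.
    groups, ignore = {}, []
    for module in model:
        scheme = getattr(module, "quantization_scheme", None)
        if scheme is None:
            ignore.append(module.name)
        else:
            groups.setdefault(f"group_{scheme.num_bits}", scheme)
    return QuantizationConfig(config_groups=groups, ignore=ignore)

model = [Module("linear1"), Module("linear2"), Module("lm_head")]
user_config = QuantizationConfig(
    config_groups={"my_w8": QuantizationScheme(targets=["linear1", "linear2"])}
)
apply_quantization_config(model, user_config)
regenerated = config_from_model(model)
# regenerated.config_groups no longer contains the user's "my_w8" key,
# and regenerated.ignore has been rebuilt from scratch
```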

This approach has some downsides (see the phi3 example config):

  • Any config group names set by the user are discarded
  • The config groups which are generated do not necessarily match the config groups set by the user
  • The ignore list becomes very large and ugly to read
  • The logic for generating a config from a model is very difficult to maintain

The scope of this issue is to investigate an approach whereby step (1) attaches the config as a quantization_config attribute on the model, which is then read directly by step (4), skipping step (3) entirely. This would mitigate all of the above downsides.
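
The proposed shortcut can be sketched as follows; the attribute name, helper names, and serialization shape are assumptions for illustration, not the actual compressed-tensors implementation:

```python
# Sketch of the proposed approach: store the user's config on the model
# in step 1 and serialize it verbatim in step 4. Names are assumptions.
from dataclasses import dataclass, field

@dataclass
class QuantizationScheme:
    targets: list

@dataclass
class QuantizationConfig:
    config_groups: dict
    ignore: list = field(default_factory=list)

class Model:
    pass

def apply_quantization_config(model, config):
    # Step 1: in addition to attaching per-module schemes, keep the
    # original config as an attribute on the model itself
    model.quantization_config = config

def update_config(model):
    # Step 4: serialize the stored config directly. No regeneration
    # (step 3) means user group names and the original ignore list
    # survive the round trip.
    config = model.quantization_config
    return {
        "config_groups": {name: vars(scheme)
                          for name, scheme in config.config_groups.items()},
        "ignore": config.ignore,
    }

model = Model()
apply_quantization_config(model, QuantizationConfig(
    config_groups={"my_w8": QuantizationScheme(targets=["Linear"])},
    ignore=["lm_head"],
))
serialized = update_config(model)
```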

Note: some things to keep in mind

  • apply_quantization_config may be applied multiple times, which may necessitate logic to "merge" quantization configs. A draft of this logic has already been written; feel free to ping @kylesayrs if you would like to leverage it, or write your own.
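
One possible merge semantic can be sketched as below; this is an independent illustration with assumed rules (later groups override same-named earlier ones, and a module stays ignored only if both configs ignore it), not the existing draft:

```python
# Hypothetical merge helper for repeated apply_quantization_config calls.
# The merge rules here are assumptions, not those of the existing draft.
from dataclasses import dataclass, field

@dataclass
class QuantizationConfig:
    config_groups: dict
    ignore: list = field(default_factory=list)

def merge_quantization_configs(base, update):
    # Later config groups override same-named earlier ones; a module
    # remains ignored only if both configs leave it unquantized.
    groups = {**base.config_groups, **update.config_groups}
    ignore = [name for name in base.ignore if name in update.ignore]
    return QuantizationConfig(config_groups=groups, ignore=ignore)

a = QuantizationConfig({"w8": "scheme_a"}, ignore=["lm_head", "embed"])
b = QuantizationConfig({"w4": "scheme_b"}, ignore=["lm_head"])
merged = merge_quantization_configs(a, b)
```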
