Better handling of device and dtype for quantizers #1428

@Giuseppe5

Description

Is your feature request related to a problem? Please describe.
device and dtype are special kwargs that should be passed along from the layers to their quantizers.
Quantizer kwargs are currently routed by prefix, but device and dtype should be special-cased so that the user does not have to manually pass weight_device, input_quant_device, and so on.
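
To make the request concrete, here is a minimal sketch of prefix-based routing with device and dtype treated as shared kwargs. It is not the library's actual implementation: the function name split_quant_kwargs, the prefix list, and the rule that an explicit prefixed value overrides the shared one are all assumptions made for illustration.

```python
# Hypothetical helper, not an existing API: routes layer kwargs to quantizers.
SHARED_KWARGS = ("device", "dtype")            # assumed special-cased names
PREFIXES = ("weight_", "input_quant_")         # assumed prefixes, taken from the names in this issue


def split_quant_kwargs(kwargs, prefixes=PREFIXES, shared=SHARED_KWARGS):
    """Return one kwargs dict per quantizer prefix.

    Prefixed kwargs (e.g. weight_bit_width) go only to the matching quantizer,
    with the prefix stripped. Shared kwargs (device, dtype) are copied to every
    quantizer unless a prefixed override such as weight_device is given.
    """
    shared_kwargs = {k: kwargs[k] for k in shared if k in kwargs}
    routed = {}
    for prefix in prefixes:
        own = {k[len(prefix):]: v for k, v in kwargs.items() if k.startswith(prefix)}
        # Explicit prefixed values win over the shared defaults.
        routed[prefix.rstrip("_")] = {**shared_kwargs, **own}
    return routed


routed = split_quant_kwargs({
    "device": "cuda",
    "dtype": "float16",
    "weight_bit_width": 4,
    "input_quant_device": "cpu",   # explicit override for one quantizer
})
```

With this scheme the user passes device and dtype once, and weight_device or input_quant_device are only needed when a quantizer should deliberately differ from the layer.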

Even if device and dtype were special-cased, there is a second issue in how we apply quantization and how it interacts with the state dict.
We create everything on the meta device and rely on loading the state dict to take care of the rest.

Quantization parameters have no entries in the loaded state dict, so they stay stuck on the meta device unless handled explicitly.

If we can propagate these params correctly, it would also solve some other annoying issues around re-init of quant_tensor.

Labels: enhancement
