[AWQ] add option to take smooth layer quantization into account #2296

@HDCharles

Normally, AWQ picks a layer that is going to be quantized (the balance layer), tries a range of scale factors to find the one that minimizes the quantization error when that layer is quantized, and then applies the inverse rescale to the preceding layer (the smooth layer), which is normally not quantized. However, a problem arises for the up_proj -> down_proj mapping, because both the smooth and balance layers are targeted for quantization. Our current AWQ implementation only takes the quantization of the balance layers into account, so the smooth layer is effectively ignored when the quantization error is calculated. The scale factor we choose for the balance layer could therefore be making the smooth layer harder to quantize.
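The difference between the two objectives can be illustrated with a toy sketch. This is not the library's implementation: `fake_quantize`, `pair_error`, and `search_scale` are hypothetical names, the gated MLP is reduced to a plain linear `up_proj -> down_proj` pair, and the scale grid follows the usual AWQ pattern of raising the mean activation magnitude to a ratio between 0 and 1.

```python
import numpy as np

def fake_quantize(w, num_bits=4):
    """Symmetric round-to-nearest fake quantization with a single
    per-tensor step size (an assumption made for this toy)."""
    qmax = 2 ** (num_bits - 1) - 1
    step = np.abs(w).max() / qmax
    if step == 0:
        return w.copy()
    return np.clip(np.round(w / step), -qmax - 1, qmax) * step

def pair_error(x, w_smooth, w_balance, scales, include_smooth):
    """MSE between the full-precision output and the output with quantized weights.
    x: (n, d_in), w_smooth: (d_hidden, d_in), w_balance: (d_out, d_hidden)."""
    reference = (x @ w_smooth.T) @ w_balance.T
    # Balance layer: scale input channels, quantize, then undo the scale.
    wb = fake_quantize(w_balance * scales) / scales
    if include_smooth:
        # Smooth layer: apply the inverse scale to output rows, quantize, undo.
        ws = fake_quantize(w_smooth / scales[:, None]) * scales[:, None]
    else:
        ws = w_smooth  # current behavior: smooth layer treated as unquantized
    out = (x @ ws.T) @ wb.T
    return float(np.mean((out - reference) ** 2))

def search_scale(x, w_smooth, w_balance, include_smooth, n_grid=20):
    """Grid search over per-channel scales, keeping the lowest-error candidate."""
    act_mean = np.abs(x @ w_smooth.T).mean(axis=0)  # input stats of the balance layer
    best_err, best_scales = np.inf, None
    for i in range(n_grid):
        scales = np.clip(act_mean ** (i / n_grid), 1e-4, None)
        scales = scales / np.sqrt(scales.max() * scales.min())
        err = pair_error(x, w_smooth, w_balance, scales, include_smooth)
        if err < best_err:
            best_err, best_scales = err, scales
    return best_scales, best_err
```

Calling `search_scale` once with `include_smooth=False` and once with `include_smooth=True`, then scoring both results with `pair_error(..., include_smooth=True)`, shows how much the balance-only choice costs once the smooth layer is quantized too. A per-tensor quantizer is used because, in this toy, a per-output-row quantizer would leave the smooth layer's rounding error unchanged under row rescaling; real quantization schemes may behave differently.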

We should

  1. test if this has a significant impact
  2. add an option to enable this feature if it's beneficial

STEPS

A) add a check here for whether smooth_name is in targeted_names and, if so, change the get_lowest_common...etc search to include the smooth layer (this is how we determine which module is run to compute the quantization error, so the smooth layer needs to be run if we're taking its quantization into account)
B) add a flag to compute_best_scale indicating whether the smooth layer is targeted
C) if necessary add the smooth layer to this dict
D) move the rescale weight code into a function which is called for each balance layer
E) if necessary, call the rescale weight code for 1/_scalesview on the smooth_layer
F) check whether this has an impact on lm_eval performance on some small set of models.
G) check how this affects the runtime of AWQ for those models.
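Steps A and B could be sketched roughly as follows. The helper names `common_parent_name` and `error_module_name` are hypothetical stand-ins and not the library's functions; the sketch only assumes that modules are identified by dotted names and that `targeted_names` is a collection of such names.

```python
def common_parent_name(names):
    """Longest dotted prefix shared by all module names (hypothetical helper)."""
    split_names = [name.split(".") for name in names]
    prefix = []
    for parts in zip(*split_names):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return ".".join(prefix)

def error_module_name(smooth_name, balance_names, targeted_names):
    """Pick the module to run when measuring quantization error.

    Returns the module name plus a flag saying whether the smooth layer is
    targeted, which the scale search could use to quantize it as well.
    """
    smooth_is_targeted = smooth_name in targeted_names
    names = list(balance_names)
    if smooth_is_targeted:
        # The smooth layer must be part of the forward pass being compared.
        names.append(smooth_name)
    return common_parent_name(names), smooth_is_targeted
```

For a mapping with smooth layer `model.layers.0.mlp.up_proj` and balance layer `model.layers.0.mlp.down_proj`, the selected module widens from the balance layer itself to `model.layers.0.mlp` once the smooth layer is in `targeted_names`. Running a larger module per candidate scale is one reason the runtime check in step G matters.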

If it's beneficial, put up a PR with those changes, demonstrating what was tested and how it affects things.

Labels

awq, enhancement, good first issue
