RFC: AWQModifier user experience #1972
brian-dellabetta
started this conversation in
RFCs
Replies: 2 comments 4 replies
Can you clarify what you mean by "disable quantization"?
Is that first use case actually effective? It seems like you're almost undoing the work of the AWQ grid search by doing GPTQ afterward, since you've explicitly grid searched with respect to a particular quant function.
We want to provide users two new abilities with the AWQ algorithm:
I believe these draft implementations are not compatible.
Currently in #1973

The user explicitly sets `AWQModifier(config_groups=None, targets=None, ...)` (as opposed to the default value of `targets=["Linear"]`). This likely isn't compatible with other proposed changes in #1961 to generalize AWQ quantization, which use targets/config_groups to determine how observers and quantization config should be attached to each module, and would allow us to support more than just W4A16 group-wise.

Suggested Change

We add another field, `AWQModifier(disable_quantization=True, ...)`, that allows the resolved quantization config to be used to attach observers and find the best scales, but skips the quantization step itself. If we want to adopt #1961, I think we need to do this instead. We would have to be careful about how we run `apply_quantization_config` here, since it might have other side effects.
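To make the difference between the two approaches concrete, here is a minimal toy sketch. It does not use the real library: `ToyAWQModifier`, `resolve_targets`, and `plan` are hypothetical names invented for illustration, and modules are modeled as a plain name-to-type mapping. The only details taken from the proposal are the `targets=["Linear"]` default and the `disable_quantization` field, which is itself only a suggestion at this point.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyAWQModifier:
    """Hypothetical stand-in for AWQModifier; NOT the real class."""

    # Default mirrors the targets=["Linear"] default mentioned above.
    targets: Optional[List[str]] = field(default_factory=lambda: ["Linear"])
    # Field proposed in this RFC (an assumption, not an existing option).
    disable_quantization: bool = False

    def resolve_targets(self, modules: Dict[str, str]) -> List[str]:
        """Return names of modules whose type matches one of the targets."""
        if self.targets is None:
            return []  # nothing to resolve, so nothing to attach observers to
        return [name for name, kind in modules.items() if kind in self.targets]

    def plan(self, modules: Dict[str, str]) -> Dict[str, List[str]]:
        """Report which modules would be observed and which quantized."""
        matched = self.resolve_targets(modules)
        return {
            "observed": matched,  # observers and scale search still run here
            "quantized": [] if self.disable_quantization else matched,
        }


# Toy model: module name -> module type.
modules = {"q_proj": "Linear", "norm": "LayerNorm", "up_proj": "Linear"}

# Approach in #1973: clearing targets also drops the observer information.
cleared = ToyAWQModifier(targets=None).plan(modules)

# Suggested change: targets still resolve, only quantization is skipped.
flagged = ToyAWQModifier(disable_quantization=True).plan(modules)

print(cleared)
print(flagged)
```

In this toy model, clearing `targets` leaves no modules to attach observers to, while the flag keeps the resolved modules available for scale search and only skips the final quantization. Whether the real `apply_quantization_config` can be split this cleanly is exactly the open question raised above.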