Why does the weight quantizer have two configs? Does it mean that the bf16 weight is first per-tensor quantized to FP8, and then the FP8 weight is per-group quantized to an int4 weight?
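For context, here is a minimal NumPy sketch of the two-stage scheme the question hypothesizes: one per-tensor scale into the FP8 range, followed by per-group int4 quantization of the scaled weight. Everything here (function name, group size, fp32 stand-in for bf16, skipping FP8's nonuniform rounding grid) is an illustrative assumption, not the library's actual implementation.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest magnitude representable in FP8 E4M3
INT4_MAX = 7          # signed int4 codes span [-8, 7]

def two_stage_fake_quant(w, group_size=128):
    """Hypothetical two-stage fake-quant: per-tensor FP8 scaling,
    then per-group int4 quantization. FP8's nonuniform grid is not
    simulated; only the scaling steps are shown."""
    # Stage 1: one scale for the whole tensor maps |w| into [0, 448].
    s_tensor = np.abs(w).max() / FP8_E4M3_MAX
    w_scaled = w / s_tensor  # would be cast to FP8 at this point

    # Stage 2: per-group scales, one per `group_size` block along each row.
    rows, cols = w_scaled.shape
    g = w_scaled.reshape(rows, cols // group_size, group_size)
    s_group = np.abs(g).max(axis=-1, keepdims=True) / INT4_MAX
    q = np.clip(np.round(g / s_group), -8, 7)  # int4 codes

    # Dequantize: applying both scales recovers an approximation of w.
    w_hat = (q * s_group).reshape(rows, cols) * s_tensor
    return q.astype(np.int8), s_group, s_tensor, w_hat

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float32)
q, s_group, s_tensor, w_hat = two_stage_fake_quant(w)
print(q.min(), q.max())         # codes stay within the int4 range
print(np.abs(w - w_hat).max())  # reconstruction error
```

If this reading is right, the two configs would correspond to the two scale factors: a per-tensor FP8 scale and per-group int4 scales that compose at dequantization time.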