[WIP] [STEP 2] split compressor into few quantizers #841

Closed
n1ck-guo wants to merge 2 commits into main from hengguo/quantizers


@n1ck-guo n1ck-guo commented Sep 23, 2025

quantizer

The quantizer replaces the original compressor's quantize method and is responsible for the actual quantization process.
Subclasses (coarse to fine granularity):

  • mode (RTN, Tune): different quantizer workflows and quantize function logic.
  • model_type (llm, vlm, diffusion): different calibration methods, data processing, etc.
  • data_type (gguf, mxfp8, wa quantizer): may require additional algorithms (imatrix for gguf) or special processing (register_act_max_hook for WA, fused_layer_global_scale for nvfp).
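
The coarse-to-fine layering above could be sketched roughly as follows. This is only an illustration, not the code in this PR: every class name here (Quantizer, TuneQuantizer, LLMTuneQuantizer, GGUFLLMTuneQuantizer, WALLMTuneQuantizer) is hypothetical, and the strings are step labels borrowed from the description rather than real function calls.

```python
# Hypothetical sketch of the mode -> model_type -> data_type layering.
# None of these class names come from the PR; steps are plain string labels.


class Quantizer:
    """Stand-in for what the compressor's quantize method used to own."""

    mode = "base"

    def calibration_steps(self) -> list[str]:
        # Overridden at the model_type level (llm, vlm, diffusion).
        return []

    def extra_steps(self) -> list[str]:
        # Overridden at the data_type level (gguf, WA, nvfp, ...).
        return []

    def quantize(self) -> list[str]:
        # Return the ordered plan of steps instead of touching a real model.
        return [*self.calibration_steps(), *self.extra_steps(), f"{self.mode}_quantize"]


class RTNQuantizer(Quantizer):  # mode level
    mode = "rtn"


class TuneQuantizer(Quantizer):  # mode level
    mode = "tune"


class LLMTuneQuantizer(TuneQuantizer):  # model_type level
    def calibration_steps(self) -> list[str]:
        return ["llm_calibration"]


class GGUFLLMTuneQuantizer(LLMTuneQuantizer):  # data_type level
    def extra_steps(self) -> list[str]:
        return ["imatrix"]


class WALLMTuneQuantizer(LLMTuneQuantizer):  # data_type level
    def extra_steps(self) -> list[str]:
        return ["register_act_max_hook"]


if __name__ == "__main__":
    for cls in (RTNQuantizer, GGUFLLMTuneQuantizer, WALLMTuneQuantizer):
        print(cls.__name__, cls().quantize())
```

In this sketch each finer level overrides one hook only, so a data_type subclass inherits the mode and calibration behaviour of its parents unchanged.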

Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@github-actions

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

@github-actions github-actions bot added the Stale label Mar 25, 2026
@n1ck-guo n1ck-guo closed this Mar 31, 2026