Support mxfp nvfp lmhead quant by WeiweiZhang1 · Pull Request #1051 · intel/auto-round

WeiweiZhang1 · 2025-11-20T14:08:30Z

related issue: #1040

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

for more information, see https://pre-commit.ci

wenhuach21 · 2025-11-21T08:06:54Z

auto_round/compressors/base.py

            if layer_name not in layer_inputs:
+                if self.act_bits < 16 and not self.act_dynamic:
+                    logger.warning(
+                        f"Due to insufficient resources: act_max hook for layer '{layer_name}' is unavailable. "


wenhuach21: Please support this via block-wise forward; refer to the AutoScheme code. @xin3he, could you take this task?

wenhuach21 (edited): If there are quantized layers outside the blocks, we could switch to this mode.

WeiweiZhang1, quoting "if there are quantized layers outside the blocks, we could switch to this mode": This happens within the _quantize_layers function, which only handles layers outside the blocks. I don't understand what switching to this mode means.

wenhuach21: AutoScheme provides an advanced way to calibrate all the layers in the model with block-wise forward.
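
For readers following the thread, here is a minimal sketch of the idea under discussion: run calibration data through the model one block at a time, then record the maximum absolute activation (act_max) seen at the input of a layer outside the blocks, such as lm_head. This is not AutoRound's or AutoScheme's implementation. The function name `blockwise_act_max`, the toy `scale_block` helper, and the list-of-floats "tensors" are all hypothetical stand-ins chosen so the example runs without any dependencies. The final scale computation is likewise only an illustrative absmax mapping onto the FP4 E2M1 range (largest magnitude 6.0), not the exact MXFP/NVFP scale formula.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Block = Callable[[Vector], Vector]


def blockwise_act_max(blocks: Sequence[Block], calib_batches: Sequence[Vector]) -> float:
    """Return max |x| over the inputs that reach the layer after the last block.

    Hypothetical helper for illustration only; names and types are assumptions.
    """
    # Start from the raw calibration inputs (stand-ins for embeddings).
    hidden = [list(batch) for batch in calib_batches]

    # Block-wise forward: push every batch through block i before moving on
    # to block i + 1, so only one block needs to be resident at a time.
    for block in blocks:
        hidden = [block(h) for h in hidden]

    # The outputs of the last block are the inputs of the outside layer
    # (e.g. lm_head). Track the running maximum absolute value across batches.
    act_max = 0.0
    for h in hidden:
        act_max = max(act_max, max(abs(x) for x in h))
    return act_max


def scale_block(factor: float) -> Block:
    """Toy block that multiplies every element by a constant (assumption)."""
    return lambda h: [factor * x for x in h]


blocks = [scale_block(2.0), scale_block(-1.5)]
calib = [[0.5, -1.0], [0.25, 2.0]]
act_max = blockwise_act_max(blocks, calib)

# Illustrative static scale: map the observed maximum onto the largest
# magnitude representable in FP4 E2M1. Real formats add block-level scales.
FP4_E2M1_MAX = 6.0
static_scale = act_max / FP4_E2M1_MAX
print(act_max, static_scale)
```

The point of the block-wise ordering is that the inputs to lm_head become available without keeping the whole model resident at once, which is the situation the "insufficient resources" warning in the diff refers to.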

auto_round/compressors/base.py

auto_round/export/export_to_llmcompressor/export_to_fp.py

xin3he

LGTM

auto_round/export/export_to_autogptq/export.py

WeiweiZhang1 and others added 13 commits October 9, 2025 15:31

fp8 exporting bugfix

719e5ab

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

Merge branch 'main' of https://github.com/intel/auto-round into main

8e8b04f

Merge branch 'main' of https://github.com/intel/auto-round into main

57842a1

refine exllama backend cuda UT

c2daa79

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

Merge branch 'main' of https://github.com/intel/auto-round into main

ca36a70

Merge branch 'main' of https://github.com/intel/auto-round into main

9ab0843

add lm_head layer act_max hook, enable mxfp/nvfp lm_head export

8176d2e

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

fbba8a6

for more information, see https://pre-commit.ci

fixtypo

4d097f8

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

fixtypo

024dfc0

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

925f038

for more information, see https://pre-commit.ci

fix ut typo

d7681f9

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

7c2c255

for more information, see https://pre-commit.ci

WeiweiZhang1 requested a review from wenhuach21 November 21, 2025 07:39

wenhuach21 reviewed Nov 21, 2025

wenhuach21 reviewed Nov 24, 2025

auto_round/compressors/base.py (outdated, resolved)

wenhuach21 reviewed Nov 24, 2025

auto_round/export/export_to_llmcompressor/export_to_fp.py (resolved)

xin3he approved these changes Nov 25, 2025

WeiweiZhang1 and others added 7 commits November 27, 2025 01:14

Merge branch 'support_mxfp_nvfp_lmhead_quant' of https://github.com/intel/auto-round into support_mxfp_nvfp_lmhead_quant

5c92161

refine logs, fix pack_layer for awq&gptq

a35f804

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

Merge branch 'main' of https://github.com/intel/auto-round into support_mxfp_nvfp_lmhead_quant

3b3b666

refine log, fix pack_layer for awq&gptq

cc78096

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

Merge branch 'support_mxfp_nvfp_lmhead_quant' of https://github.com/intel/auto-round into support_mxfp_nvfp_lmhead_quant

d461adf

[pre-commit.ci] auto fixes from pre-commit.com hooks

a6b914b

for more information, see https://pre-commit.ci

fixtypo

4d807c0

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

wenhuach21 reviewed Nov 27, 2025

auto_round/export/export_to_autogptq/export.py (resolved)

WeiweiZhang1 and others added 4 commits November 28, 2025 01:22

add awq&gptq lm_head UT

17c71f7

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

f96274d

for more information, see https://pre-commit.ci

fix local path

31a30c7

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>

Merge branch 'support_mxfp_nvfp_lmhead_quant' of https://github.com/intel/auto-round into support_mxfp_nvfp_lmhead_quant

6f8f4a5

WeiweiZhang1 merged commit c4a1479 into main Nov 28, 2025
26 checks passed

WeiweiZhang1 deleted the support_mxfp_nvfp_lmhead_quant branch November 28, 2025 09:15

WeiweiZhang1 merged 24 commits into main from support_mxfp_nvfp_lmhead_quant

3 participants
