Fix AutoScheme low memory flag propagation from CLI #1596

Merged
chensuyue merged 4 commits into main from lvl/fix_cli_low_cpu_mem on Mar 27, 2026

@lvliang-intel
Contributor

Description

This PR fixes inconsistent memory-mode behavior between the main AutoRound flow and AutoScheme when running from CLI.

The low_gpu_mem_usage and low_cpu_mem_usage values from the CLI were passed to AutoRound but not to AutoScheme.
As a result, AutoScheme ran under a different memory strategy than the main quantization flow, which could lead to unexpectedly high CPU RAM usage and misleading memory readings during AutoScheme generation.
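
The mismatch described above can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins, not the real AutoRound or AutoScheme classes, and the default value of False for both flags is an assumption made only for illustration. The sketch shows how a flag that is forwarded to one object but not the other leaves the two running with different settings.

```python
from dataclasses import dataclass


@dataclass
class MemoryFlags:
    """The two CLI memory options named in this PR (defaults are assumed)."""
    low_gpu_mem_usage: bool = False
    low_cpu_mem_usage: bool = False


class SchemeStandIn:
    """Hypothetical stand-in for AutoScheme; not the real class."""

    def __init__(self, avg_bits, low_gpu_mem_usage=False, low_cpu_mem_usage=False):
        self.avg_bits = avg_bits
        self.flags = MemoryFlags(low_gpu_mem_usage, low_cpu_mem_usage)


class MainFlowStandIn:
    """Hypothetical stand-in for the main AutoRound flow; not the real class."""

    def __init__(self, scheme, low_gpu_mem_usage=False, low_cpu_mem_usage=False):
        self.scheme = scheme
        self.flags = MemoryFlags(low_gpu_mem_usage, low_cpu_mem_usage)


def build_before_fix(cli, avg_bits):
    # The CLI flags reach the main flow only; the scheme object is built
    # without them and silently falls back to its own defaults.
    scheme = SchemeStandIn(avg_bits)
    main = MainFlowStandIn(scheme, cli.low_gpu_mem_usage, cli.low_cpu_mem_usage)
    return main, scheme


main, scheme = build_before_fix(MemoryFlags(True, True), avg_bits=2.5)
print(main.flags)    # flags requested on the command line
print(scheme.flags)  # defaults, because nothing was forwarded
```

Because nothing raises an error when the keyword arguments are omitted, this kind of divergence tends to show up only as unexpected resource usage.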

Command used to reproduce:

CUDA_VISIBLE_DEVICES=6 python -m auto_round /mnt/disk2/lvl/Qwen3-1.7B/ --target_bits 2.5 --ignore_scale_zp_bits --options "gguf:q2_k_s,gguf:q4_k_s" --iters 1 --nsamples 16

Before fix:
image

After fix:
image

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

#1586

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Copilot AI review requested due to automatic review settings March 23, 2026 05:50
Contributor

Copilot AI left a comment

Pull request overview

Fixes inconsistent memory-strategy behavior when AutoScheme is invoked via the CLI --avg_bits/--options path by ensuring the same low-memory flags used by the main AutoRound flow are applied to AutoScheme.

Changes:

  • Move low_cpu_mem_usage resolution earlier so it’s available before AutoScheme construction.
  • Pass low_gpu_mem_usage and low_cpu_mem_usage into the AutoScheme(...) constructor when --avg_bits is used.
  • Update the CLI help text for the deprecated --low_cpu_mem_usage flag.
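
The first two changes listed above can be sketched as follows. This is an illustrative outline under stated assumptions, not the code merged in this PR: the parser is a reduced stand-in for the real CLI, `SchemeStandIn` is a hypothetical substitute for AutoScheme, `resolve_low_cpu_mem_usage` is a hypothetical helper, and treating both memory options as boolean switches is an assumption. Only the option names and the fact that both flags are passed to the AutoScheme constructor come from the review summary.

```python
import argparse


class SchemeStandIn:
    """Hypothetical stand-in for AutoScheme; not the real class."""

    def __init__(self, avg_bits, options, low_gpu_mem_usage=False, low_cpu_mem_usage=False):
        self.avg_bits = avg_bits
        self.options = options
        self.low_gpu_mem_usage = low_gpu_mem_usage
        self.low_cpu_mem_usage = low_cpu_mem_usage


def build_parser():
    # Reduced stand-in parser; the real CLI defines many more options.
    parser = argparse.ArgumentParser()
    parser.add_argument("--avg_bits", type=float, default=None)
    parser.add_argument("--options", type=str, default=None)
    parser.add_argument("--low_gpu_mem_usage", action="store_true")
    parser.add_argument("--low_cpu_mem_usage", action="store_true")
    return parser


def resolve_low_cpu_mem_usage(args):
    # Hypothetical helper: the real resolution logic is not shown in this PR page.
    return bool(args.low_cpu_mem_usage)


def build_scheme(args):
    # Resolve the CPU flag first so it is available when the scheme is built.
    low_cpu_mem_usage = resolve_low_cpu_mem_usage(args)
    if args.avg_bits is None:
        return None
    return SchemeStandIn(
        avg_bits=args.avg_bits,
        options=args.options.split(",") if args.options else [],
        low_gpu_mem_usage=args.low_gpu_mem_usage,
        low_cpu_mem_usage=low_cpu_mem_usage,
    )


args = build_parser().parse_args(
    ["--avg_bits", "2.5", "--options", "gguf:q2_k_s,gguf:q4_k_s", "--low_cpu_mem_usage"]
)
scheme = build_scheme(args)
print(scheme.low_gpu_mem_usage, scheme.low_cpu_mem_usage)
```

Resolving the flag once, before any object that needs it is constructed, keeps the main flow and the scheme generator reading the same value instead of each applying its own default.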

lvliang-intel and others added 2 commits March 23, 2026 14:20
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@chensuyue chensuyue merged commit 11debae into main Mar 27, 2026
30 checks passed
@chensuyue chensuyue deleted the lvl/fix_cli_low_cpu_mem branch March 27, 2026 01:43
3 participants