
[model_free_ptq] build job cleanup#2545

Merged
brian-dellabetta merged 19 commits into main from bdellabe/model-free-ptq-cleanup on Mar 31, 2026

Conversation

@brian-dellabetta
Collaborator

SUMMARY:
Follow-up to #2498 and precursor to landing #2491.

This PR cleans up a few things:

  • Uses the same function signature for building standard jobs, microscale jobs, and validation jobs. This will be needed for DeepSeek V3.2 support (#2491).
  • Renames the microscale-specific build_inverse_weights_map -> build_microscale_inverse_weights_map, because other reindexing logic will need different functionality when determining fused tensors.
  • Prunes the unused _get_all_tensor_names
  • Breaks out the loading logic for inverse_weights_map into a helper that can be moved to compressed-tensors (CT) in follow-up DeepSeek V3.2 support (#2491)
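The unified per-job loading semantics described above could look roughly like the following. This is an illustrative sketch, not the actual llm-compressor API: the names `load_job_tensors` and `read_tensors` are hypothetical, and only the dict convention (`{path: None}` loads everything, `{path: [names]}` loads a subset) is taken from the PR description.

```python
from typing import Any, Callable, Dict, List, Optional

def load_job_tensors(
    inverse_weights_map: Dict[str, Optional[List[str]]],
    read_tensors: Callable[[str, Optional[List[str]]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Load the tensors one job needs, under a single interface.

    Each key is a resolved safetensors file path. A value of None means
    "load every tensor in that file" (standard jobs); a list of names means
    "load only those tensors" (microscale jobs with cross-shard partners).
    """
    tensors: Dict[str, Any] = {}
    for path, names in inverse_weights_map.items():
        tensors.update(read_tensors(path, names))
    return tensors
```

Because standard, microscale, and validation jobs all pass the same dict shape, the loader itself no longer needs to know which kind of job it is serving.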

TEST PLAN:
No net-new functionality; if all tests pass, this should be good to go.

dzhengAP and others added 12 commits March 25, 2026 01:19
… reads

Each shard is processed independently with full parallelism. When fused
weight sets (q/k/v, gate/up) span multiple shards, only the specific
partner tensors needed for global scale fusion are fetched via targeted
partial safetensors reads using safe_open.

- build_weights_map(): maps tensor names to source files via index.json
- _fetch_fused_partners(): partial reads of only fused partner tensors
- validate_file(): add optional weights_map param for future use
- One job per shard, no grouping, no cross-process coordination required
- validate.py: remove NotImplementedError, cross-shard handled natively

Closes #2497
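The index-based mapping described in the commit message can be sketched as follows. This is an assumption-laden illustration: `build_weights_map` here only parses a Hugging Face-style `model.safetensors.index.json`, and `invert_weights_map` is a hypothetical helper showing how a per-file grouping (shard file -> tensor names) could be derived so each shard job knows which partner tensors to fetch.

```python
import json
from collections import defaultdict
from typing import Dict, List

def build_weights_map(index_json_path: str) -> Dict[str, str]:
    # tensor name -> shard file, read from a HF-style safetensors index
    with open(index_json_path) as f:
        return json.load(f)["weight_map"]

def invert_weights_map(
    weights_map: Dict[str, str], tensor_names: List[str]
) -> Dict[str, List[str]]:
    # shard file -> the subset of tensor_names stored in that file
    inverse: Dict[str, List[str]] = defaultdict(list)
    for name in tensor_names:
        inverse[weights_map[name]].append(name)
    return dict(inverse)
```

Per the commit message, the actual partial reads of fused partner tensors then go through safetensors' `safe_open`, so only the named tensors are pulled from each partner shard rather than the whole file.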

Signed-off-by: David Zheng <dqzheng1996@gmail.com>
- Always use inverse_weights_map dict format for all jobs
- Standard jobs: {resolved_path: None} = load all tensors
- Microscale jobs: {src: [tensors]} = selective loading with cross-shard partners
- Update process_file and validate_file to accept inverse_weights_map format
- Add backward compatibility: isinstance check BEFORE .keys() call in both functions
- Move build_inverse_weights_map to microscale.py for better code organization (per Brian's feedback)
- Fix _build_validate_jobs to pass inverse_weights_map dict
- Update build_inverse_weights_map to handle empty weight_map
- Fix imports: get_checkpoint_files, is_weights_file from compressed_tensors.entrypoints.convert.file_utils
- Add match_name helper with 're:' regex pattern support to microscale.py
- Fix __all__ syntax in microscale.py
- Remove non-existent update_safetensors_index import and call
- Fix test imports, argument order, shard names, and ALL assertions to match new inverse_weights_map return format
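The `match_name` helper with `'re:'` pattern support mentioned above might behave along these lines. A minimal sketch, assuming the common compressed-tensors convention that a pattern prefixed with `re:` is treated as a regular expression and anything else as an exact name; the real signature and semantics in microscale.py may differ.

```python
import re

def match_name(name: str, pattern: str) -> bool:
    """Match a tensor name against a target pattern.

    Patterns prefixed with "re:" are matched as regular expressions
    (anchored at the start of the name); all other patterns require an
    exact string match.
    """
    if pattern.startswith("re:"):
        return re.match(pattern[3:], name) is not None
    return name == pattern
```

For example, `"re:.*q_proj.*"` would select every layer's q_proj weight, while a bare `"lm_head.weight"` selects only that one tensor.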

Testing:
- pytest tests/llmcompressor/entrypoints/model_free/ -v: 16 passed, 1 skipped
- make style && make quality: all checks pass

Reviewer Feedback:
- Brian: Unified signature, inverse_weights_map per-job scope, single interface; move build_inverse_weights_map to microscale.py
- Kyle: Precomputed map, safe_open partial reads, partner re-saved
- Gemini: Single-file fallback, top-level imports, simplified discovery

Breaking Changes: None — internal refactoring only. Public API unchanged.

Signed-off-by: David Zheng <dqzheng1996@gmail.com>
Signed-off-by: David Zheng <dqzheng1996@gmail.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request refactors the model-free PTQ entry point by unifying job construction for different quantization schemes and centralizing tensor loading logic into a new helper function. Key changes include renaming microscale-specific functions for clarity and updating tests to match the new API. Feedback identifies efficiency issues in the tensor loading loop and potential OOM risks when loading directly to a GPU, as well as a typo in a docstring.

@mergify

mergify bot commented Mar 30, 2026

The quality checks have failed. Please run make style and make quality under
the root directory to address the lint failures. You will need the dev
optional install to get the required linting packages:
https://github.com/vllm-project/llm-compressor/blob/main/CONTRIBUTING.md

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@mergify mergify bot removed the quality-failed label Mar 30, 2026

brian-dellabetta and others added 2 commits March 30, 2026 17:39
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@mergify mergify bot removed the quality-failed label Mar 30, 2026
@brian-dellabetta added the ready (When a PR is ready for review) label Mar 31, 2026
@brian-dellabetta merged commit 031d912 into main Mar 31, 2026
13 of 15 checks passed
@brian-dellabetta deleted the bdellabe/model-free-ptq-cleanup branch March 31, 2026 19:52

Labels

ready When a PR is ready for review


4 participants