[model_free_ptq] build job cleanup#2545
Conversation
Each shard is processed independently with full parallelism. When fused weight sets (q/k/v, gate/up) span multiple shards, only the specific partner tensors needed for global scale fusion are fetched via targeted partial safetensors reads using safe_open.
- build_weights_map(): maps tensor names to source files via index.json
- _fetch_fused_partners(): partial reads of only fused partner tensors
- validate_file(): add optional weights_map param for future use
- One job per shard, no grouping, no cross-process coordination required
- validate.py: remove NotImplementedError, cross-shard handled natively

Closes #2497

Signed-off-by: David Zheng <dqzheng1996@gmail.com>
- Always use inverse_weights_map dict format for all jobs
- Standard jobs: {resolved_path: None} = load all tensors
- Microscale jobs: {src: [tensors]} = selective loading with cross-shard partners
- Update process_file and validate_file to accept inverse_weights_map format
- Add backward compatibility: isinstance check BEFORE .keys() call in both functions
- Move build_inverse_weights_map to microscale.py for better code organization (per Brian's feedback)
- Fix _build_validate_jobs to pass inverse_weights_map dict
- Update build_inverse_weights_map to handle empty weight_map
- Fix imports: get_checkpoint_files, is_weights_file from compressed_tensors.entrypoints.convert.file_utils
- Add match_name helper with 're:' regex pattern support to microscale.py
- Fix __all__ syntax in microscale.py
- Remove non-existent update_safetensors_index import and call
- Fix test imports, argument order, shard names, and ALL assertions to match new inverse_weights_map return format
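Taken together, a consumer of the unified job format might look like the sketch below. `load_job_tensors` and `read_tensors` are illustrative names, not the actual helpers; the dict shapes follow the bullets above.

```python
def load_job_tensors(job, read_tensors):
    # A job is either a bare file path (legacy) or an inverse_weights_map
    # dict mapping source file -> list of tensor names (None = all tensors)
    if not isinstance(job, dict):  # backward compat: check BEFORE dict access
        job = {job: None}

    tensors = {}
    for src, names in job.items():
        # read_tensors(src, names) stands in for a selective safetensors
        # read; names=None means load the full shard
        tensors.update(read_tensors(src, names))
    return tensors
```

A standard job (`{resolved_path: None}`) and a microscale job (`{src: [tensors]}`) then flow through the same code path, which is the "single interface" the refactor is after.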
Testing:
- pytest tests/llmcompressor/entrypoints/model_free/ -v: 16 passed, 1 skipped
- make style && make quality: all checks pass
Reviewer Feedback:
- Brian: Unified signature, inverse_weights_map per-job scope, single interface; move build_inverse_weights_map to microscale.py
- Kyle: Precomputed map, safe_open partial reads, partner re-saved
- Gemini: Single-file fallback, top-level imports, simplified discovery
Breaking Changes: None — internal refactoring only. Public API unchanged.
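The 're:' pattern support mentioned for the match_name helper follows the prefix convention used elsewhere in compressed-tensors; a rough sketch, not the exact implementation:

```python
import re


def match_name(name, target):
    # A target is either a literal tensor name or, with the "re:" prefix,
    # a regular expression matched from the start of the name
    if target.startswith("re:"):
        return re.match(target[3:], name) is not None
    return name == target
```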
Signed-off-by: David Zheng <dqzheng1996@gmail.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.
Code Review
This pull request refactors the model-free PTQ entry point by unifying job construction for different quantization schemes and centralizing tensor loading logic into a new helper function. Key changes include renaming microscale-specific functions for clarity and updating tests to match the new API. Feedback identifies efficiency issues in the tensor loading loop and potential OOM risks when loading directly to a GPU, as well as a typo in a docstring.
The quality checks have failed.
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
SUMMARY:
Follow-up to #2498 and a precursor to landing #2491.
This PR cleans up a few things:
- build_inverse_weights_map -> build_microscale_inverse_weights_map, because other reindexing logic will need different functionality when determining fused tensors
- _get_all_tensor_names

TEST PLAN:
No net new functionality; if all tests pass, this should be good to go.