[model_free_ptq] Enhance to work with previously quantized checkpoints like nvidia/DeepSeek-R1-NVFP4
#2228
Merged
63 commits
Commit list (all authored by brian-dellabetta unless noted):

- bdc5e5d example p1
- cde1c3a p2
- 06695a5 p2
- a9a567f use targets
- 264636a update quant config
- 255f803 comments
- 02bf5ee script cleanup
- 22a4758 minor cleanup
- bfe4e5c ignore default values
- e713d4b Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block
- b6c9807 stylefixes
- a4d4ad9 invert global input/weight scales
- 5ee4758 fix
- 64944e0 updates
- 9f89d29 missing format
- d79e0b9 minor touchups
- e0e8ccb comment typo
- 302330e merge main
- 8339433 Processor protocol
- 2c1f5d2 cleanup
- f3e33a5 cleanup
- c9c023a cleanup
- 0adf115 helper cleanup
- 7f7663c bugfix
- a54d4cb Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block
- c49f401 fix logic, match_quantizable_tensors
- 49683b6 Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block
- 3b667fc target regex update
- 5fc016f refactor to CT entrypoint
- 179b70a update create config
- 69e9a4a minor cleanup
- 692bd13 fix overwrite qconfig
- 2f882ef revert example
- 869d85d Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block
- 4b47725 refactor from CT changes
- 3cb89dd cleanup
- 0ee7d9b cleanup
- 6120b26 post-refactor cleanup
- 0663bd0 test cosmetics
- 39f9442 docstrings
- be73088 docstring
- 7e241d0 minor refactor, exec_jobs
- a5a1b43 prune find_safetensors_index_file
- f4bb2d9 bugfix
- 6fb2fb6 typo
- 43b2c36 move simlar named helper to private
- 75e6478 prune helper
- 7f4cd5e Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block
- c59baa3 move entrypoints tests to dedicated folder
- 2041946 move model free validate
- a556514 entrypoints tests
- b9ce613 cleanup
- e025a5f cleanup
- 5ae4c63 rename example
- 96daf7c Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block
- 2b1c26d Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block
- d7cba48 reindex_fused_weights
- 07ccbbe Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block
- dd0be8e test_calib_deepseekv3_module consistency fix
- f520382 Merge branch 'main' into bdellabe/example-dsr1-nvfp4-fp8block (dsikka)
- 1c6874f failing test fix
- 1ed8a9d add not isnan assertion
- 9c0a8dc cicd test fix
The new 56-line example file added in this PR:

```python
from llmcompressor import model_free_ptq
from compressed_tensors.quantization import (
    QuantizationScheme,
    QuantizationArgs,
    QuantizationStrategy,
    QuantizationType,
)

MODEL_ID = "nvidia/DeepSeek-R1-NVFP4"
SAVE_DIR = MODEL_ID.rstrip("/").split("/")[-1] + "-FP8-BLOCK"

# Apply FP8-Block to the model's compatible self_attn Linear layers.
# Once quantized, the model is saved using compressed-tensors to SAVE_DIR.
model_free_ptq(
    model_stub=MODEL_ID,
    save_directory=SAVE_DIR,
    scheme=QuantizationScheme(
        weights=QuantizationArgs(
            num_bits=8,
            type=QuantizationType.FLOAT,
            strategy=QuantizationStrategy.BLOCK,
            symmetric=True,
            dynamic=False,
            block_structure=[128, 128],
        ),
        input_activations=QuantizationArgs(
            num_bits=8,
            type=QuantizationType.FLOAT,
            strategy=QuantizationStrategy.GROUP,
            symmetric=True,
            dynamic=True,
            observer=None,
            group_size=128,
        ),
        # TODO cannot set targets here, must be ["Linear"]
        # targets=[
        #     "re:.*self_attn.(o_proj|q_a_proj|q_b_proj).*"
        # ],
    ),
    ignore=[
        # NOTE: self_attn.kv_a_proj_with_mqa has incompatible shape 576x7168
        # with block size 128x128
        # NOTE: self_attn.kv_b_proj is already dequantized by MLA
        # This regex matches all strings that don't contain one of the
        # following substrings:
        # - self_attn.o_proj
        # - self_attn.q_a_proj
        # - self_attn.q_b_proj
        "re:^(?!.*self_attn.(o_proj|q_a_proj|q_b_proj)).*$"
    ],
    max_workers=8,
    device="cuda:0",
)
```
The file ends with two outstanding TODO comments:

```python
# TODO reverse modelopt NVFP4 tensor packing order
# TODO merge hf_quant_config.json with CT quantization_config in config.json
```
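The `strategy=BLOCK` weight scheme with `block_structure=[128, 128]` assigns one scale per 128x128 weight tile, which is why `self_attn.kv_a_proj_with_mqa` (576x7168) is ignored: 576 is not divisible by 128. A minimal numpy sketch of the idea, not llmcompressor's implementation; the tiling and the FP8 e4m3 max of 448.0 are stated here only for illustration:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3

def block_scales(weight: np.ndarray, block=(128, 128)) -> np.ndarray:
    """One FP8 scale per (block[0] x block[1]) tile, from each tile's absmax."""
    rows, cols = weight.shape
    br, bc = block
    assert rows % br == 0 and cols % bc == 0, "shape must tile evenly"
    # Reshape so axes 1 and 3 index positions inside each tile
    tiles = weight.reshape(rows // br, br, cols // bc, bc)
    absmax = np.abs(tiles).max(axis=(1, 3))
    return absmax / FP8_E4M3_MAX

w = np.random.randn(256, 384).astype(np.float32)
print(block_scales(w).shape)  # (2, 3): one scale per 128x128 block
```

A 576-row weight would fail the divisibility assertion, matching the NOTE in the example's `ignore` list.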