Commit 7d0f7a9
QLoRA DDP export (#353)
## What does this PR do?
**Type of change:** New example
**Overview:** This PR provides an end-to-end example for fine-tuning a model
using QLoRA with DDP and exporting the checkpoint for deployment with vLLM.
1. This PR contains a temporary fix for loading the best checkpoint at the
end of DDP training; it can be removed once we move to using `get_peft_model()`.
2. The final base checkpoint is exported under `output_dir/base_model`,
while the adapter weights are exported under `output_dir`.
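The export layout in item 2 can be checked with a small sketch like the one below. The helper name `describe_export_layout` is hypothetical and not part of this PR; it only encodes the two paths stated above and does not inspect file contents.

```python
from pathlib import Path


def describe_export_layout(output_dir: str) -> dict:
    """Return the two export locations described above and whether each exists.

    Hypothetical helper for illustration; not part of the PR.
    """
    root = Path(output_dir)
    base_dir = root / "base_model"  # base checkpoint lives in a subdirectory
    return {
        "adapter_dir": str(root),  # adapter weights sit at the top level
        "base_model_dir": str(base_dir),
        "adapter_dir_exists": root.is_dir(),
        "base_model_dir_exists": base_dir.is_dir(),
    }


if __name__ == "__main__":
    # "test-fp8" matches the export directory used in the Testing section.
    for name, value in describe_export_layout("test-fp8").items():
        print(f"{name}: {value}")
```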
## Usage
Refer to README.md changes
## Testing
Trainer
- [x] `./launch.sh --model meta-llama/Meta-Llama-3-8B --num_epochs 0.01 --lr 1e-3 --do_train True --output_dir test --quant_cfg FP8_DEFAULT_CFG --compress True --lora True`
Export
- [x] `python export.py --pyt_ckpt_path test --export_dir test-fp8`
Deployment
- [x] `vllm serve test-fp8/base_model --enable-lora --lora-modules sql-lora=test-fp8 --port 8090 --tokenizer test-fp8`
- [x] e2e unit test
- [ ] Sanity-check weights and dtypes of the generated checkpoint
- [x] Test phi4
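Once the `vllm serve` command above is running, the adapter can be exercised through vLLM's OpenAI-compatible completions endpoint, where an adapter registered via `--lora-modules` is addressed by its name (`sql-lora` here). The following is a minimal sketch, assuming the server is reachable on `localhost`; the function names and the example prompt are illustrative only and not part of this PR.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    model: str = "sql-lora",  # adapter name from --lora-modules above
    port: int = 8090,  # port from the vllm serve command above
    max_tokens: int = 64,
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        f"http://localhost:{port}/v1/completions",  # assumes a local server
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_adapter(prompt: str) -> str:
    """Send the prompt to the running server and return the generated text."""
    request = build_completion_request(prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["choices"][0]["text"]


if __name__ == "__main__":
    # Requires the vllm server from the Deployment step to be running.
    print(query_adapter("Write a SQL query that counts rows in table users."))
```

Passing `model="test-fp8/base_model"` instead should target the base checkpoint without the adapter, since vLLM serves the base model under the path given to `vllm serve` by default.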
## Before your PR is "*Ready for review*"
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: Yes
- **Did you add or update any necessary documentation?**: Yes
- **Did you update
[Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No
## Additional Information
## Summary by CodeRabbit
* **New Features**
  * Added an export CLI/tool to produce HuggingFace-ready checkpoints from
    LoRA/QLoRA-trained models, including optional restoration of quantizer
    state.
* **Improvements**
  * Export now respects a QLoRA mode, filters and strips adapter entries
    appropriately, and only emits per-layer quantization when weight
    quantization is enabled. Saves model state earlier after quantization
    and tightens checks for exporting quantized weights.
* **Documentation**
  * Expanded LLM QAT README with QLoRA export and deployment guidance.
* **Tests**
  * Re-enabled a previously skipped QLoRA test.
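To illustrate what "filters and strips adapter entries" could look like, here is a minimal sketch that splits a state dict into base and adapter entries. It is not the PR's implementation: the function name is hypothetical, and the key patterns (`lora_` for adapter tensors, `.base_layer.` for wrapped base weights) are assumed from common PEFT naming.

```python
from typing import Any, Dict, Tuple


def split_adapter_entries(
    state_dict: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate adapter tensors from base tensors (hypothetical helper).

    Assumes PEFT-style key names; the actual export code may differ.
    """
    base: Dict[str, Any] = {}
    adapter: Dict[str, Any] = {}
    for key, value in state_dict.items():
        if "lora_" in key:
            # Adapter tensors (e.g. lora_A / lora_B) are kept apart.
            adapter[key] = value
        else:
            # Strip the wrapper segment so base keys match the plain model.
            base[key.replace(".base_layer.", ".")] = value
    return base, adapter


if __name__ == "__main__":
    example = {
        "model.layers.0.q_proj.base_layer.weight": "W",
        "model.layers.0.q_proj.lora_A.default.weight": "A",
        "lm_head.weight": "H",
    }
    base, adapter = split_adapter_entries(example)
    print(sorted(base))
    print(sorted(adapter))
```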
---------
Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: sugunav14 <[email protected]>
Co-authored-by: realAsma <[email protected]>
1 parent b1fc1fe commit 7d0f7a9
File tree
7 files changed, +224 -23 lines changed
- examples/llm_qat
- modelopt/torch
  - export
  - quantization/plugins
- tests/examples/llm_qat
Lines changed: 24 additions & 3 deletions