SUMMARY:
Added dispatch for generation to the FP8 block quantization example.
TEST PLAN:
```
python3 examples/quantization_w8a8_fp8/fp8_block_example.py
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [02:18<00:00, 8.67s/it]
2025-08-26T12:27:52.974527+0000 | reset | INFO - Compression lifecycle reset
2025-08-26T12:27:53.024383+0000 | _create_default_logger | INFO - Logging all LLM Compressor modifier-level logs to sparse_logs/26-08-2025_12.27.53.log
2025-08-26T12:27:53.024771+0000 | from_modifiers | INFO - Creating recipe from modifiers
2025-08-26T12:27:55.045066+0000 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
2025-08-26T12:27:55.045382+0000 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
Some parameters are on the meta device because they were offloaded to the cpu.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49975/49975 [00:00<00:00, 568185.21it/s]
Calibrating weights: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49975/49975 [06:36<00:00, 126.20it/s]
2025-08-26T12:35:34.595007+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
2025-08-26T12:35:42.534632+0000 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
========== SAMPLE GENERATION ==============
Some parameters are on the meta device because they were offloaded to the cpu.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Hello my name is Lillie and I'm a student in the 7th grade. I have a math problem
==========================================
2025-08-26T12:36:53.305881+0000 | get_model_compressor | INFO - skip_sparsity_compression_stats set to True. Skipping sparsity compression statistic calculations. No sparsity compressor will be applied.
```
Signed-off-by: shanjiaz <[email protected]>