Update pull.yml to test snapshot saving and loading #1486
Conversation
test snapshot saving and loading
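The round trip this PR exercises can be sketched roughly as below. This is a minimal sketch only: a plain `nn.Linear` stands in for the torchchat model, and the snapshot path is hypothetical (the real flow goes through torchchat's CLI and would carry quantized weights in the saved module).

```python
import os
import tempfile

import torch
import torch.nn as nn

# Minimal sketch of a snapshot save/load round trip. A plain nn.Linear stands
# in for the torchchat model; in the real flow the snapshot would also carry
# quantized tensor subclasses, which is what this PR's test exercises.
model = nn.Linear(8, 4)
snapshot_path = os.path.join(tempfile.mkdtemp(), "model_snapshot.pt")  # hypothetical path

# Save the whole module object so any tensor subclasses travel with it.
torch.save(model, snapshot_path)

# weights_only=False is needed to unpickle a full module object
# (torch.load defaults to weights_only=True in recent releases).
reloaded = torch.load(snapshot_path, weights_only=False)

# The round trip should preserve the parameters bit-for-bit.
assert torch.equal(model.weight, reloaded.weight)
```

A real test would additionally run generation with both the original and the reloaded model and compare outputs, rather than only comparing parameters.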
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1486
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 5d098bd with merge base 083fdaf.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Fixed typos.
cuda-32.json because somebody would rather fail a job than accept a partial group
@jerryzh168 @Jack-Khuu can you please have a look at what happens with reloading of the Int4 quantized Linear class from torchao? https://hud.pytorch.org/pr/pytorch/torchchat/1486#36825796920 shows this failure: pull / test-gpu-aoti-bfloat16 (cuda, stories15M) / linux-job (gh)
Remove fp16 and fp32 int4 quantized models for now. @jerryzh168 Any idea why these dtypes are not compatible with int4 quantization?
Thanks for the find, it's using cuda so it should be using the new subclass APIs too, hmm. See torchchat/utils/quantize.py, lines 114 to 117 at 53a1004.
add DEVICE specification for snapshot and use device cpu
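The device-pinning this commit adds can be sketched as follows, assuming a snapshot written on one device must be restored on CPU: `map_location="cpu"` remaps tensor storages on load, so a snapshot produced on a CUDA machine still loads on a CPU-only runner (the case the CPU test here depends on).

```python
import io

import torch
import torch.nn as nn

# Sketch of device-safe snapshot loading. The BytesIO buffer stands in for
# the snapshot file on disk.
model = nn.Linear(4, 2)
buf = io.BytesIO()
torch.save(model.state_dict(), buf)
buf.seek(0)

# map_location="cpu" remaps storages on load, so a CUDA-written snapshot
# can still be restored on a CPU-only CI runner.
state = torch.load(buf, map_location="cpu", weights_only=True)
model.load_state_dict(state)

assert all(t.device.type == "cpu" for t in state.values())
```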
@Jack-Khuu Updated to run at minimum the CPU test correctly. Recommend we land that, then put up a new PR to enable CUDA, which you can assign to the right stakeholder working with torchao?
I was just going to recommend that. Sounds like a plan, thanks for following up!
Thanks, will follow up later. Created an issue to track here: pytorch/ao#1727