Bump torchao pin to pick up qmv_fast optimization #1541

manuelcandales · 2025-05-12T17:33:34Z

Performance improvements to the low-bit quantized linear Metal kernels in torchao. See AO PR#2167 for details.
The table below summarizes torchchat's decode speed (tokens/second) on the Metal backend on an M1 Max 64GB after this update.

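| Bits | Llama 3.2-1B | Llama 3.2-3B | Llama 3.1-8B |
|------|--------------|--------------|--------------|
| 1 | 179.96 | 87.01 | 51.10 |
| 2 | 186.91 | 98.13 | 54.37 |
| 3 | 170.62 | 85.69 | 48.05 |
| 4 | 175.15 | 89.54 | 50.51 |
| 5 | 147.10 | 70.19 | 38.58 |
| 6 | 140.51 | 63.62 | 35.48 |
| 7 | 131.27 | 64.19 | 32.69 |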
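
For anyone who wants to compare bit widths at a glance, here is a minimal sketch that normalizes the numbers in the table above against the 4-bit row. The helper names (`relative_to`, `fastest_bits`) and the choice of 4-bit as the baseline are illustrative assumptions, not part of torchchat or torchao; the values are copied from the table as-is.

```python
# Decode speeds (tokens/second) copied from the table above,
# keyed by model name and then by quantization bit width.
decode_tps = {
    "Llama 3.2-1B": {1: 179.96, 2: 186.91, 3: 170.62, 4: 175.15,
                     5: 147.10, 6: 140.51, 7: 131.27},
    "Llama 3.2-3B": {1: 87.01, 2: 98.13, 3: 85.69, 4: 89.54,
                     5: 70.19, 6: 63.62, 7: 64.19},
    "Llama 3.1-8B": {1: 51.10, 2: 54.37, 3: 48.05, 4: 50.51,
                     5: 38.58, 6: 35.48, 7: 32.69},
}


def relative_to(baseline_bits, speeds):
    """Return each bit width's speed as a ratio of the baseline bit width's speed."""
    base = speeds[baseline_bits]
    return {bits: tps / base for bits, tps in speeds.items()}


def fastest_bits(speeds):
    """Return the bit width with the highest decode speed."""
    return max(speeds, key=speeds.get)


# Hypothetical baseline choice: 4-bit, purely for illustration.
for model, speeds in decode_tps.items():
    ratios = relative_to(4, speeds)
    summary = ", ".join(f"{bits}b: {ratio:.2f}x" for bits, ratio in sorted(ratios.items()))
    print(f"{model} (fastest: {fastest_bits(speeds)}-bit) {summary}")
```

This only rearranges the reported measurements; it says nothing about speeds before the pin bump, which are not listed here.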

pytorch-bot · 2025-05-12T17:33:38Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1541

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Unrelated Failures

As of commit 0698a1d with merge base a37b08a:

NEW FAILURE - The following job has failed:

pull / test-build-runner-et-android / linux-job (gh)
RuntimeError: Command docker exec -t 8f05e606be25d5248ca329f8bfacbfd7e9855aeb20a62dccfa0b5478722d0c63 /exec failed with exit code 1

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / runner-et (16-core-ubuntu) (gh) (trunk failure)
##[error]The operation was canceled.
pull / runner-et (macos-14-xlarge) (gh) (trunk failure)
Process completed with exit code 1.
pull / test-torchao-experimental-et (macos-14-xlarge) (gh) (trunk failure)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Jack-Khuu · 2025-05-12T20:19:34Z

You can ignore the ET issues; I'm bumping it today.

lmk when you want to land and I'll bypass/force

facebook-github-bot added the CLA Signed label May 12, 2025

bump torchao pin

e6aa542

manuelcandales force-pushed the manuel/qmv-fast branch from dd8206e to e6aa542 Compare May 12, 2025 17:35

Merge branch 'main' into manuel/qmv-fast

0698a1d

Jack-Khuu approved these changes May 12, 2025

Jack-Khuu changed the title ~~bump torchao pin~~ Bump torchao pin to pick up qmv_fast optimization May 12, 2025

manuelcandales merged commit fd3059b into main May 13, 2025
68 of 72 checks passed