This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Conversation

@manuelcandales (Contributor) commented on Dec 10, 2024:

This PR adds the quantization scheme linear:afpwx, which quantizes only the weights, groupwise, at a specified bitwidth. It accepts a bitwidth of 1 through 7 and a groupsize of 32, 64, 128, or 256.

To use linear:afpwx, you must first set up the torchao mps experimental kernels. These will only work on a device with Apple Silicon.

From the torchchat root directory, run:

```
bash torchchat/utils/scripts/build_torchao_ops.sh mps
```

Note that this quantization scheme is currently implemented only for the mps device.

```
python3 torchchat.py generate stories110M --device mps --dtype float32 --quantize '{"linear:afpwx": {"bitwidth": 4, "groupsize": 256}}' --prompt "Once upon a time,"
```
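For intuition, here is a minimal sketch of what groupwise weight-only quantization does: each contiguous group of `groupsize` weights along the input dimension shares one scale at the chosen bitwidth. This is an illustrative assumption of the general technique, not the torchao MPS kernel implementation; the function name and the symmetric-quantization choice are hypothetical.

```python
import torch

def quantize_groupwise_sketch(weight: torch.Tensor, bitwidth: int = 4, groupsize: int = 256):
    """Hypothetical sketch: symmetric groupwise weight-only quantization.

    Each group of `groupsize` values shares one scale; values are rounded
    to signed integers representable in `bitwidth` bits.
    """
    out_features, in_features = weight.shape
    assert in_features % groupsize == 0, "in_features must be divisible by groupsize"
    # View the weight as (out_features, num_groups, groupsize)
    groups = weight.reshape(out_features, in_features // groupsize, groupsize)
    qmax = 2 ** (bitwidth - 1) - 1
    qmin = -(2 ** (bitwidth - 1))
    # One scale per group (symmetric: scale chosen so the group max maps to qmax)
    scales = groups.abs().amax(dim=-1, keepdim=True) / qmax
    scales = scales.clamp(min=1e-9)  # avoid division by zero for all-zero groups
    q = torch.clamp(torch.round(groups / scales), qmin, qmax)
    dequant = (q * scales).reshape(out_features, in_features)
    return q.to(torch.int8), scales, dequant
```

At bitwidth 4, quantized values land in [-8, 7]; smaller groupsizes give finer-grained scales (better accuracy, more metadata), which is the accuracy/size trade-off the `groupsize` argument controls.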

@pytorch-bot bot commented on Dec 10, 2024:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1415

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 1 Pending, 2 Unrelated Failures

As of commit c164f88 with merge base 4dc2f89:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

### Use

#### linear:fpaxw
The quantization scheme linear:fpaxw quantizes only the weights in a groupwise manner with a specified bitwidth and groupsize.
Contributor commented:

Should we keep the naming convention that torchchat uses ("a" followed by a type and "w" followed by a type)? This started with a8w4dq, before I added any kernels. In your case this would be something like afpwx.

@manuelcandales (Contributor, Author) replied:

Ok, this makes sense.


#### Eager mode
```
python3 torchchat.py generate stories110M --device mps --dtype float32 --quantize '{"linear:fpaxw": {"bitwidth": 4, "groupsize": 256}}' --prompt "Once upon a time," --num-samples 5
```
@metascroy (Contributor) commented on Dec 13, 2024:

Do these only work with eager? If so, explicitly say that in the set-up section?

@manuelcandales (Contributor, Author) replied:

The Metal low-bit kernels run with ExecuTorch as well (the llama runner can use them). However, my aim in this torchchat PR was only to enable eager mode. I plan a follow-up PR to enable them via the torchchat ET path too, but I prefer to keep things modular.

@manuelcandales (Contributor, Author) replied:

I added a sentence to the setup section clarifying that torchchat can currently use them only in eager mode.

Contributor commented:

Can you add a CI test for the MPS kernels to make sure they install and run?

See https://github.com/pytorch/torchchat/blob/main/.github/workflows/pull.yml#L1060 as an example.

@manuelcandales (Contributor, Author) replied:

done!

@manuelcandales force-pushed the torchao-mps branch 3 times, most recently from 270044b to 75804d8 on December 13, 2024 at 18:56.
@manuelcandales manuelcandales merged commit 570aebc into main Dec 13, 2024
51 of 53 checks passed
vmpuri pushed a commit that referenced this pull request Feb 4, 2025

Labels: CLA Signed (managed by the Meta Open Source bot)