@SS-JIA SS-JIA commented Oct 15, 2024

Stack from ghstack (oldest at bottom):

Context

This diff adds the `VkInt4WeightOnlyQuantizer` class, which enables 4-bit quantization of linear layers via source transformation. The quantizer class is copied from `torchao.quantization.GPTQ.WeightOnlyInt4Linear`, with minor changes annotated in the implementation.

Note that the pt2e quantization flow does not yet support groupwise quantization, so source transformation is currently the only way to perform it.
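
For readers unfamiliar with the term, groupwise 4-bit weight quantization can be illustrated with a minimal pure-Python sketch. This is not the code added in this diff: the function names `quantize_groupwise_int4` and `dequantize_groupwise_int4` are hypothetical, and the asymmetric min/scale scheme is an assumption chosen for clarity. Each group of `group_size` consecutive weights in a row gets its own scale and offset, so an outlier in one group does not degrade the precision of the others.

```python
from typing import List, Tuple


def quantize_groupwise_int4(
    row: List[float], group_size: int
) -> Tuple[List[int], List[float], List[float]]:
    """Quantize one weight row to 4-bit codes (0..15), one (scale, min) pair per group.

    Hypothetical helper for illustration only; not the API added by this diff.
    """
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    codes: List[int] = []
    scales: List[float] = []
    mins: List[float] = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        lo, hi = min(group), max(group)
        # 4 bits give 16 levels, so the group's range is split into 15 steps.
        # A constant group would give a zero scale; fall back to 1.0.
        scale = (hi - lo) / 15 or 1.0
        scales.append(scale)
        mins.append(lo)
        for w in group:
            code = round((w - lo) / scale)
            codes.append(min(15, max(0, code)))  # clamp to the 4-bit range
    return codes, scales, mins


def dequantize_groupwise_int4(
    codes: List[int], scales: List[float], mins: List[float], group_size: int
) -> List[float]:
    """Map 4-bit codes back to floats using each group's scale and min."""
    return [
        mins[i // group_size] + code * scales[i // group_size]
        for i, code in enumerate(codes)
    ]


# Two groups of four weights with very different ranges.
row = [0.0, 0.5, 1.0, 1.5, -8.0, -4.0, 0.0, 7.0]
codes, scales, mins = quantize_groupwise_int4(row, group_size=4)
restored = dequantize_groupwise_int4(codes, scales, mins, group_size=4)
```

In this sketch the first group keeps a fine step size even though the second group spans a much wider range, which is the motivation for quantizing per group rather than per tensor. A real implementation would additionally pack two 4-bit codes into each byte and operate on tensors.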

Differential Revision: D64406457

pytorch-bot bot commented Oct 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6234

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d2e8d29 with merge base 8673567:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the ciflow/periodic and module: vulkan labels Oct 15, 2024
@facebook-github-bot facebook-github-bot added the CLA Signed label Oct 15, 2024
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D64406457

@facebook-github-bot

This pull request has been merged in 58ee33d.

@SS-JIA SS-JIA deleted the gh/SS-JIA/115/head branch January 24, 2025 19:45

Labels

ciflow/periodic
CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
fb-exported
Merged
module: vulkan (Issues related to the Vulkan delegate and code under backends/vulkan/)
