Added quantization for evaluation script #11822
Conversation
          
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11822
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 1 Pending. As of commit 6cd35a3 with merge base a12a005.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D76837572

This PR needs a
Summary: Added quantization to the evaluation script. Quantization causes a deterioration in accuracy on the wikitext task:

| Model Name | max_seq_len | ptq | word_perplexity |
|----------|----------|----------|-----------|
| Llama 3.2-1B Instruct | 128 | 16a4w | 5821003.055178451 |
| Llama 3.2-1B Instruct | 128 | 16a4w_block | 5396240.078572427 |
| Llama 3.2-1B Instruct | 128 | 8a8w | 533154.970440251 |

Differential Revision: D76837572
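The summary reports word_perplexity but does not show how it is computed. As a rough illustration only (not the PR's actual code), word-level perplexity in lm-eval-style wikitext evaluation exponentiates the total negative log-likelihood over the whitespace word count, which keeps scores comparable across tokenizers. The function name and the toy numbers below are hypothetical:

```python
import math

def word_perplexity(token_nlls, num_words):
    """Word-level perplexity: exp(total NLL / number of words).

    token_nlls: per-token negative log-likelihoods (in nats).
    num_words: whitespace word count of the evaluated text, so the
    score is comparable across tokenizers with different vocabularies.
    """
    return math.exp(sum(token_nlls) / num_words)

# Toy numbers, not from the PR: 5 tokens covering 3 words.
nlls = [2.1, 1.8, 2.5, 2.0, 1.9]
print(round(word_perplexity(nlls, 3), 3))
```

Under this metric, the huge values in the table above reflect quantization pushing per-token NLL sharply higher on wikitext.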
4fba0c7 to 88a66d7 (Compare)
88a66d7 to 9607154 (Compare)
Thank you for the ppl summary table
9607154 to 8de8d9b (Compare)
8de8d9b to d3228aa (Compare)
  
d3228aa to c9c6560 (Compare)
c9c6560 to 6cd35a3 (Compare)
Summary:
Added quantization to the evaluation script. Quantization causes a deterioration in accuracy on the wikitext task:

| Model Name | max_seq_len | ptq | word_perplexity |
|----------|----------|----------|-----------|
| Llama 3.2-1B Instruct | 128 | 16a4w | 5821003.055178451 |
| Llama 3.2-1B Instruct | 128 | 16a4w_block | 5396240.078572427 |
| Llama 3.2-1B Instruct | 128 | 8a8w | 533154.970440251 |

Reviewed By: cccclai

Differential Revision: D76837572

Pull Request resolved: pytorch#11822
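The table's ptq column (16a4w, 16a4w_block, 8a8w) suggests the evaluation script gained a flag for selecting a post-training quantization scheme. The following is a hypothetical sketch of such wiring using argparse; the flag name, choices, and defaults mirror the table but are assumptions, not the actual executorch code:

```python
import argparse

# Hypothetical PTQ schemes, taken from the PR's results table.
PTQ_CHOICES = ["16a4w", "16a4w_block", "8a8w"]

def parse_args(argv=None):
    """Parse eval-script options; --ptq omitted means no quantization."""
    parser = argparse.ArgumentParser(description="eval with optional PTQ")
    parser.add_argument("--ptq", choices=PTQ_CHOICES, default=None,
                        help="post-training quantization scheme")
    parser.add_argument("--max_seq_len", type=int, default=128,
                        help="maximum sequence length for evaluation")
    return parser.parse_args(argv)

args = parse_args(["--ptq", "16a4w", "--max_seq_len", "128"])
print(args.ptq, args.max_seq_len)
```

With this shape, each row of the table corresponds to one invocation, e.g. `--ptq 8a8w --max_seq_len 128`.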