Fixed bug for 16a4w ptq #12167
Conversation
Summary: Currently, running the script executorch/examples/models/llama/export_llama.py with the flag --ptq 16a4w performs 16a16w quantization; this diff fixes that. This may be related to some open GitHub issues.
Differential Revision: D77671468
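For context, here is a minimal, self-contained sketch of the flag-to-config mapping that the --ptq option is supposed to perform. The names PtqConfig, choose_ptq_config, and _PTQ_CONFIGS are hypothetical illustrations, not the actual export_llama.py or QNN quantizer API; this only shows the shape of the bug, where "16a4w" silently resolved to the 16a16w configuration.

```python
# Hedged sketch (hypothetical names, not the real executorch API): what a
# correct --ptq flag -> quantization-config mapping looks like. The reported
# bug is the "16a4w" path effectively falling through to the 16a16w config.
from dataclasses import dataclass


@dataclass(frozen=True)
class PtqConfig:
    activation_bits: int
    weight_bits: int


# Hypothetical mapping from the --ptq CLI value to bit widths.
_PTQ_CONFIGS = {
    "16a4w": PtqConfig(activation_bits=16, weight_bits=4),
    "16a16w": PtqConfig(activation_bits=16, weight_bits=16),
}


def choose_ptq_config(ptq_flag: str) -> PtqConfig:
    """Resolve --ptq explicitly; raising on unknown values prevents a silent
    fallback to a default (e.g. 16a16w) config."""
    try:
        return _PTQ_CONFIGS[ptq_flag]
    except KeyError as e:
        raise ValueError(f"Unsupported --ptq value: {ptq_flag!r}") from e


if __name__ == "__main__":
    # With the fix, "16a4w" must yield 4-bit weights, not 16-bit.
    cfg = choose_ptq_config("16a4w")
    assert cfg.weight_bits == 4
    print(cfg)
```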
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12167
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 Cancelled Jobs as of commit 338770d with merge base 967cfae. The cancelled jobs should be retried.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D77671468
cccclai left a comment:
Thanks for the fix! @haowhsu-quic @winskuo-quic @chunit-quic fyi for the fix