Update TorchAO README inference section before PTC #3206

jerryzh168 · 2025-10-17T22:12:29Z

Summary:
As the title says.

Test Plan:
visual inspection

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2025-10-17T22:12:32Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3206

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job, 2 Unrelated Failures

As of commit f7762ad with merge base ced6231:

NEW FAILURES - The following jobs have failed:

Run Regression Tests / test (CUDA 2.6, linux.g5.12xlarge.nvidia.gpu, torch==2.6.0, cuda, 12.6) / linux-job (gh)
test/dtypes/test_affine_quantized.py::TestAffineQuantized::test_copy__mismatch_metadata_apply_quant6
Run Regression Tests / test (CUDA 2.7, linux.g5.12xlarge.nvidia.gpu, torch==2.7.0, cuda, 12.6) / linux-job (gh)
test/dtypes/test_affine_quantized.py::TestAffineQuantized::test_copy__mismatch_metadata_apply_quant6
Run Regression Tests / test (CUDA 2.8, linux.g5.12xlarge.nvidia.gpu, torch==2.8.0, cuda, 12.6) / linux-job (gh)
test/dtypes/test_affine_quantized.py::TestAffineQuantized::test_copy__mismatch_metadata_apply_quant6

CANCELLED JOB - The following job was cancelled. Please retry:

Run Regression Tests (aarch64) / test-cpu-ops (linux.arm64.2xlarge) (gh)
##[error]The operation was canceled.

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Run Regression Tests (aarch64) / test-cpu-ops (macos-14) (gh) (trunk failure)
/Users/runner/miniconda3/envs/venv/lib/python3.10/site-packages/executorch/share/cmake/../../include/executorch/runtime/core/memory_allocator.h:143:21: error: no matching function for call to 'mul_overflows'
Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh) (trunk failure)
test/dtypes/test_nf4.py::TestComm::test_comm

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2025-10-17T22:26:25Z

README.md

 from torchao.quantization import Int4WeightOnlyConfig, quantize_
-quantize_(model, Int4WeightOnlyConfig(group_size=32, version=1))
-```
-Compared to a `torch.compiled` bf16 baseline, your quantized model should be significantly smaller and faster on a single A100 GPU:


Removing these since memory/latency numbers for a toy model are not meaningful, and to make our README shorter.

jerryzh168 · 2025-10-17T22:27:19Z

README.md


 TorchAO is integrated into some of the leading open-source libraries including:

 * HuggingFace transformers with a [builtin inference backend](https://huggingface.co/docs/transformers/main/quantization/torchao) and [low bit optimizers](https://github.com/huggingface/transformers/pull/31865)


Reordered a bit to put the more commonly used ones earlier.


meta-cla bot added the CLA Signed label Oct 17, 2025

jerryzh168 added the topic: documentation label and removed the CLA Signed label Oct 17, 2025

jerryzh168 force-pushed the update-readme-10-2025 branch from b0fc829 to 081118f on October 17, 2025 22:25

meta-cla bot added the CLA Signed label Oct 17, 2025

jerryzh168 commented Oct 17, 2025


jerryzh168 force-pushed the update-readme-10-2025 branch from 081118f to d45a249 on October 17, 2025 22:31

supriyar reviewed Oct 17, 2025

README.md (resolved)

Update TorchAO README inference section before PTC

f7762ad


jerryzh168 force-pushed the update-readme-10-2025 branch from d45a249 to f7762ad on October 18, 2025 00:00
