fix torchao memory check #12539
base: main
Conversation
I just ran the test on an H100 and it worked fine.
It seems like a device-related issue. Can we change the test to a bigger model so it also works on other devices, or change the ratio as in this PR? We want the test to pass on XPU and A100.
Hi @sayakpaul, could you share the ratio on H100?
Signed-off-by: jiqing-feng <[email protected]>
The test `pytest -rA tests/quantization/torchao/test_torchao.py::TorchAoTest::test_model_memory_usage` failed on A100. I guess it is because the model is too small, so most of the memory is consumed by CUDA kernel launches instead of model weights. If we change it to a large model like `black-forest-labs/FLUX.1-dev`, the ratio will be `24244073472 / 12473665536 = 1.9436206143278139` @sayakpaul. Please review this PR. Thanks!
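For illustration, the ratio check described above can be sketched roughly as below. This is a minimal, hypothetical version of the assertion, not the actual test code; the byte counts are the FLUX.1-dev figures quoted in this thread, and the `1.5` threshold is an assumed loosened bound of the kind this PR proposes so that small per-device overheads (e.g. CUDA kernel launch allocations) do not flip the result across GPUs.

```python
# Hypothetical sketch of a memory-usage ratio check for a quantized model.
# Byte counts are the FLUX.1-dev numbers quoted in the thread; the
# threshold of 1.5 is an assumed relaxed bound, not the test's real value.

def memory_ratio(unquantized_bytes: int, quantized_bytes: int) -> float:
    """Ratio of unquantized to quantized model memory footprint."""
    return unquantized_bytes / quantized_bytes

ratio = memory_ratio(24244073472, 12473665536)
# With a large model, weights dominate, so the ratio approaches the
# theoretical compression factor rather than being swamped by overhead.
assert ratio > 1.5, "quantized model should use noticeably less memory"
print(round(ratio, 4))
```

With a tiny model, fixed per-device allocations dominate both measurements and the ratio collapses toward 1, which is why the test passes on H100 but fails on A100/XPU.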