Feature: add cli option for torch's built in matmul precision on supported graphics cards #413
AeneasTews wants to merge 2 commits into jwohlwend:main from
When using a supported card and a preview build of PyTorch (currently tested with version 2.8.0.dev20250616+cu128 on an NVIDIA 5070 Ti), PyTorch reports that a matmul precision setting is available. Using high or medium instead of highest drastically improves runtimes. This commit adds a command-line option to select the setting based on user preference; the default is highest. Keeping the default at highest should not cause any compatibility issues, as it is also the current default.
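For readers who have not opened the diff, the idea can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the flag name `--matmul_precision` and the helper names `build_parser` and `apply_matmul_precision` are hypothetical, and `argparse` is used only to keep the sketch self-contained. The one real API call is `torch.set_float32_matmul_precision`, which accepts `"highest"`, `"high"`, or `"medium"`.

```python
import argparse

# The three values accepted by torch.set_float32_matmul_precision.
PRECISIONS = ("highest", "high", "medium")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical --matmul_precision flag."""
    parser = argparse.ArgumentParser(description="matmul precision sketch")
    parser.add_argument(
        "--matmul_precision",  # hypothetical flag name, for illustration only
        choices=PRECISIONS,
        default="highest",  # matches the current behaviour, so nothing changes by default
        help="float32 matmul precision passed to PyTorch",
    )
    return parser


def apply_matmul_precision(precision: str) -> None:
    """Forward the chosen setting to PyTorch."""
    # Imported lazily so that argument parsing works without loading torch.
    import torch

    torch.set_float32_matmul_precision(precision)


# Typical use in an entry point (not executed here):
#   args = build_parser().parse_args()
#   apply_matmul_precision(args.matmul_precision)
```

Restricting the flag to the three known values means a typo fails at argument parsing rather than inside PyTorch.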
|
Here's another vote for this! I almost made the same PR, but then I saw that someone beat me to it... I've edited this manually in the past, and I've seen a significant speedup going from [...] It might also be worth checking the warnings filter as part of this. There's a call to [...]
|
We've found in the past that using high or medium can hurt performance, so I'm not super eager to incentivize users to do this. It's not just a question of card compatibility.
|
@jwohlwend I don't see a problem if it's done in the way this PR does it: keep the default at [...] The main thing that would incentivize people to drop the accuracy level is the big "you're not fully using your GPU" warning message that PyTorch prints out, but that's not touched by this PR. As I mentioned above, it seems like there's code to suppress that message, but I still see it, even with the latest release.
|
No, that's my point: we've observed accuracy issues with TF32.
|
Oh! I thought you meant "performance" in the time/efficiency sense. If there are accuracy problems then I agree that this is a lot more dubious.
|
@jwohlwend thank you very much for your responses. Would you be able to share the tests that you performed to determine the accuracy deterioration at different precision levels? Thank you very much for your help! Best regards!
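The thread does not include the maintainers' tests. As a rough illustration of where the accuracy concern comes from, the sketch below emulates the reduced input precision of TF32 in pure Python. TF32 keeps 10 explicit mantissa bits instead of float32's 23, so each matmul input carries a relative error of up to about 2^-10. The helper names are hypothetical, and the sketch assumes simple truncation of the mantissa; real hardware may round differently and accumulates in float32, so the numbers are indicative only.

```python
import random
import struct


def truncate_to_tf32(x: float) -> float:
    """Emulate TF32 input precision by clearing the low 13 float32 mantissa bits.

    Assumption: truncation toward zero; actual hardware rounding may differ.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # keep sign, 8 exponent bits, top 10 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(a, b):
    """Plain dot product, accumulated in Python floats (float64)."""
    return sum(x * y for x, y in zip(a, b))


def dot_product_error(n: int = 1024, seed: int = 0) -> float:
    """Absolute error of a length-n dot product when inputs are cut to TF32."""
    rng = random.Random(seed)
    a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    exact = dot(a, b)
    approx = dot([truncate_to_tf32(x) for x in a],
                 [truncate_to_tf32(x) for x in b])
    return abs(approx - exact)
```

Calling `dot_product_error()` with different `n` gives a feel for how the input rounding accumulates across a reduction; it says nothing about how a particular model's outputs are affected, which would need an end-to-end comparison.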
When using a supported card and a preview build of PyTorch (currently tested with version 2.8.0.dev20250616+cu128 on an NVIDIA 5070 Ti), PyTorch reports that a matmul precision setting is available. Using high or medium instead of highest drastically improves runtimes; the improvement can be as much as 100%. This commit adds a command-line option to select the setting based on user preference; the default is highest. Keeping the default at highest should not cause any compatibility issues, as it is also the current default.