[pt2e] Make prepare and convert faster by caching (#2983)
**Summary:** This is the torchao version of pytorch/pytorch#162550
by @navsud. The PR description is reproduced below:
D79674759 attempted to fix the expensive prepare and convert steps,
which were slow because `assert_and_get_unique_device` was called multiple times.
This change fixes that issue by caching the result with the `functools.cache` decorator.
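As a rough illustration of the caching approach (not the exact torchao patch), the sketch below shows how `functools.cache` can memoize a device-lookup helper so repeated calls during prepare and convert become cheap. The helper name mirrors `assert_and_get_unique_device`, but the body is a simplified assumption, not the actual library code:

```python
# Sketch only: a simplified stand-in for assert_and_get_unique_device,
# showing how functools.cache avoids re-scanning a module's parameters
# and buffers on every call during prepare/convert.
import functools

import torch


@functools.cache  # results are memoized per module instance
def assert_and_get_unique_device(module: torch.nn.Module) -> torch.device:
    # Collect the devices of all parameters and buffers.
    devices = {p.device for p in module.parameters()} | {
        b.device for b in module.buffers()
    }
    assert len(devices) <= 1, f"Expected a single device, got {devices}"
    # Fall back to CPU if the module has no parameters or buffers.
    return next(iter(devices)) if devices else torch.device("cpu")
```

Because `nn.Module` instances are hashable by identity, the cache is keyed per module object, so the parameter scan runs at most once per module rather than once per call.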
**Test Plan:**
Verified on LLM export to QNN.
LLM quantization prepare time was reduced from ~20 min to ~3 min.