Commit 04a4439

[Common] Help gc to collect unused model outputs (#3643)
### Changes

The variable holding model outputs during statistics collection is now explicitly freed after each iteration.

### Reason for changes

The quantization cell was failing on my machine with 125 GB of RAM in the [SD v3 TorchFX notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/stable-diffusion-v3-torch-fx/stable-diffusion-v3-torch-fx.ipynb). After verifying with the [PyTorch profiler tool](https://docs.pytorch.org/docs/stable/profiler) that the PyTorch code itself is not leaking

(screenshot: PyTorch profiler memory trace)

I confirmed that the problem is not directly related to statistics collection. I then forced garbage collection with the `del` statement: memory did not exceed a healthy 50 GB during the runtime, and SQ was successfully applied to the model (in contrast with previous runs).

### Related tickets

### Tests

I'm not sure which test should cover this.
1 parent 58d8d8c commit 04a4439
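
For readers reproducing the diagnosis, here is a minimal sketch (not part of the commit) of the profiling approach the message describes: `torch.profiler` with `profile_memory=True` shows whether memory is held by PyTorch allocations or by lingering Python references. The model and inputs are hypothetical stand-ins.

```python
# Minimal sketch (not from this commit) of memory profiling with the
# PyTorch profiler referenced in the commit message.
import torch
from torch.profiler import ProfilerActivity, profile

model = torch.nn.Linear(1024, 1024)  # hypothetical stand-in model
inputs = [torch.randn(64, 1024) for _ in range(8)]

with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    for x in inputs:
        outputs = model(x)
        # Mirroring the fix: drop the reference so the previous iteration's
        # outputs are not kept alive while the next inference runs.
        del outputs

print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=5))
```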

File tree

1 file changed (+5 −0 lines changed)

src/nncf/common/tensor_statistics/aggregator.py

Lines changed: 5 additions & 0 deletions
```diff
@@ -86,6 +86,11 @@ def collect_statistics(self, model: TModel, graph: NNCFGraph) -> None:
                 outputs = engine.infer(input_data)
                 processed_outputs = self._process_outputs(outputs)
                 self._register_statistics(processed_outputs, merged_statistics)
+                # Manually dereference output tensors to hint gc to remove them. Without it,
+                # the processed_outputs and outputs remain during model inference,
+                # increasing the peak memory consumption.
+                del processed_outputs
+                del outputs
                 processed_samples += 1
         if processed_samples == 0:
             raise nncf.ValidationError(EMPTY_DATASET_ERROR)
```
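
Why the two `del` statements lower peak memory: without them, the names `outputs` and `processed_outputs` still point at the previous iteration's tensors while `engine.infer` runs for the next sample, so two generations of outputs can be alive at once. A self-contained sketch of that lifetime effect, with a hypothetical `Big` class standing in for output tensors:

```python
# Hypothetical illustration of the reference-lifetime issue fixed above.
# Big instances stand in for model output tensors.
class Big:
    alive = 0

    def __init__(self) -> None:
        Big.alive += 1

    def __del__(self) -> None:
        Big.alive -= 1


def infer() -> Big:
    # Peak point: in the unfixed loop, the previous iteration's outputs
    # are still referenced while this call allocates new ones.
    print("alive during infer:", Big.alive)
    return Big()


for _ in range(3):
    outputs = infer()
    del outputs  # with `del`, the count stays at 0 at every infer call
```

With the `del` line removed, the same loop prints alive counts of 0, 1, 1: each previous result survives into the next `infer` call, which is exactly the peak-memory growth the commit eliminates.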
