Description
Dear author,
I know the Autoformer code has been cloned across many different repositories, so I came here to report the issue.
I noticed that the Autoformer model can accumulate GPU memory when the model instance is wrapped and compiled with torch.compile. Under the same training function, I observed the following:
- The issue appears ONLY for the Autoformer model; training other types of models does not show a similar problem.
- The problem is not reproduced when the model is not wrapped with torch.compile.
- Memory usage keeps accumulating until the end of the training loop (across all epochs) and eventually ends in an out-of-memory error, even though torch.cuda.empty_cache(), gc.collect(), and direct del <VARIABLES> were used aggressively inside the loop (see the sketch after this list).
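For reference, a minimal sketch of the setup in which I see the growth. `Autoformer`, `configs`, `train_loader`, and `num_epochs` are placeholders for the usual training script in this repo; only the torch.compile wrapping and the in-loop cleanup calls are the parts relevant to this report.

```python
import gc
import torch

# Placeholders: Autoformer, configs, train_loader, num_epochs stand in for the
# usual training script; only torch.compile and the cleanup calls matter here.
model = Autoformer(configs).cuda()
compiled_model = torch.compile(model)                      # wrapping that triggers the growth
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()

for epoch in range(num_epochs):
    for batch_x, batch_y, batch_x_mark, batch_y_mark in train_loader:
        batch_x, batch_y = batch_x.float().cuda(), batch_y.float().cuda()
        batch_x_mark, batch_y_mark = batch_x_mark.float().cuda(), batch_y_mark.float().cuda()

        # Decoder input built as in the usual run scripts: label segment + zero padding
        dec_inp = torch.zeros_like(batch_y[:, -configs.pred_len:, :])
        dec_inp = torch.cat([batch_y[:, :configs.label_len, :], dec_inp], dim=1)

        optimizer.zero_grad()
        outputs = compiled_model(batch_x, batch_x_mark, dec_inp, batch_y_mark)
        loss = criterion(outputs[:, -configs.pred_len:, :], batch_y[:, -configs.pred_len:, :])
        loss.backward()
        optimizer.step()

        # Aggressive cleanup inside the loop; none of this stopped the accumulation
        del outputs, loss
        gc.collect()
        torch.cuda.empty_cache()
```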
Notably, the FEDformer model, which shares most of its blocks with Autoformer except that AutoCorrelation is swapped for a FourierBlock, does not produce a similar issue.
Contrary to my expectation, partially disabling compilation on the AutoCorrelation forward did not solve the problem; disabling compilation of EncoderLayer only slowed down the rate of accumulation.
I resolved the problem by disabling compilation entirely with @torch.compiler.disable() on the model's forward function, roughly as sketched below.
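In case it helps others, a minimal sketch of the workaround; the subclass name is hypothetical and the import path assumes the usual models/Autoformer.py layout, so adjust it to your fork.

```python
import torch
from models.Autoformer import Model as Autoformer   # import path assumed; adjust to your layout

class AutoformerNoCompile(Autoformer):
    """Illustrative subclass: same model, but its forward is excluded from compilation."""

    @torch.compiler.disable()        # the workaround that stopped the memory growth for me
    def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, *args, **kwargs):
        return super().forward(x_enc, x_mark_enc, x_dec, x_mark_dec, *args, **kwargs)

model = AutoformerNoCompile(configs).cuda()          # configs as in the training script
compiled_model = torch.compile(model)                # the rest of the pipeline stays unchanged,
                                                     # but the Autoformer forward now runs eagerly
```

With this, torch.compile still wraps the model, but Dynamo skips the Autoformer forward, and the memory accumulation no longer appears in my runs.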
Environment
os==ubuntu-22.04-lts
python==3.10
torch==2.6.0+cu118