Scheduler + gradient clipping with AMP16 (manual optimization) #10880
Unanswered
OverLordGoldDragon asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 1 comment
```python
def on_before_optimizer_step(self, optimizer, optimizer_idx):
    self.clip_gradients(
        optimizer,
        gradient_clip_val=self.hparams.gradient_clip_val,
        gradient_clip_algorithm=self.hparams.gradient_clip_algorithm,
    )
```
seems to work.
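Building on that, here is a minimal sketch of how the hook could sit inside a manual-optimization module together with a per-step scheduler, assuming Lightning ~1.5 and `Trainer(precision=16)`; the network, loss, and hyperparameter values are illustrative, not from this thread:

```python
import torch
import pytorch_lightning as pl


class ManualAMPModule(pl.LightningModule):
    def __init__(self, gradient_clip_val=1.0, gradient_clip_algorithm="norm"):
        super().__init__()
        self.save_hyperparameters()
        self.automatic_optimization = False  # manual optimization
        self.net = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        opt = self.optimizers()
        sch = self.lr_schedulers()

        x, y = batch
        loss = torch.nn.functional.mse_loss(self.net(x), y)

        opt.zero_grad()
        self.manual_backward(loss)  # loss scaling handled by the precision plugin
        opt.step()                  # unscaling and scaler.step handled by the precision plugin
        sch.step()                  # stepwise LR schedule

    # As in the reply above: clip here. Per the source links cited in the
    # question below, this point comes after the gradients are unscaled and
    # before scaler.step is called.
    def on_before_optimizer_step(self, optimizer, optimizer_idx):
        self.clip_gradients(
            optimizer,
            gradient_clip_val=self.hparams.gradient_clip_val,
            gradient_clip_algorithm=self.hparams.gradient_clip_algorithm,
        )

    def configure_optimizers(self):
        opt = torch.optim.Adam(self.parameters(), lr=1e-3)
        sch = torch.optim.lr_scheduler.StepLR(opt, step_size=1000, gamma=0.5)
        return [opt], [sch]


# e.g. trainer = pl.Trainer(precision=16, gpus=1)
```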
How should it be done? I couldn't sort it out from the docs. Suppose a typical manual-optimization setup with 16-bit AMP and a per-step LR scheduler. It's unclear where self.clip_gradients should go; an overridden def optimizer_step never gets called with such code. Clipping should follow unscaling, which is how automatic optimization does it: first
https://github.com/PyTorchLightning/pytorch-lightning/blob/20bef8327f52248a02dfc6c013afb90089d01519/pytorch_lightning/plugins/precision/native_amp.py#L87-L88
then
https://github.com/PyTorchLightning/pytorch-lightning/blob/20bef8327f52248a02dfc6c013afb90089d01519/pytorch_lightning/plugins/precision/precision_plugin.py#L127
lastly
https://github.com/PyTorchLightning/pytorch-lightning/blob/20bef8327f52248a02dfc6c013afb90089d01519/pytorch_lightning/plugins/precision/native_amp.py#L93
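For reference, that unscale → clip → step ordering matches the standard torch.cuda.amp recipe; a minimal plain-PyTorch sketch, independent of Lightning (toy model and values, needs a CUDA device):

```python
import torch

model = torch.nn.Linear(32, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    x = torch.randn(8, 32, device="cuda")
    y = torch.randn(8, 1, device="cuda")

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.mse_loss(model(x), y)

    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)                               # 1) unscale gradients in-place
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # 2) clip the unscaled gradients
    scaler.step(optimizer)                                   # 3) step (skipped if grads are inf/NaN)
    scaler.update()
```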
With manual optimization, however, it seems we'd have to recreate a bunch of internals to insert clipping between unscaling and the weight update. It's also unclear how overriding optimizer_step works, since it's never called when using the docs' code for manual optimization. Related to #9923?
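On that last point: optimizer_step is a hook of the automatic-optimization loop, so with automatic_optimization = False (where training_step calls opt.step() itself) the override being bypassed is expected. A rough sketch of the override as it appears in automatic optimization, with roughly the Lightning ~1.5 signature (illustrative, not from this thread):

```python
import pytorch_lightning as pl


class AutoOptModule(pl.LightningModule):
    # Called by the automatic-optimization loop around every optimizer step.
    # With self.automatic_optimization = False this hook is not invoked,
    # which matches the observation above.
    def optimizer_step(self, epoch, batch_idx, optimizer, optimizer_idx=0,
                       optimizer_closure=None, on_tpu=False,
                       using_native_amp=False, using_lbfgs=False):
        optimizer.step(closure=optimizer_closure)
```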