Replies: 2 comments 1 reply
-
It does seem like an interesting idea in principle, but I haven't read about any canonical way of doing it.
0 replies
-
Is it very complicated to just clip gradients in the standard update function, e.g.,
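As a rough illustration of what clipping inside a plain update step could look like (a sketch only, not the snippet originally posted with this comment; `sgd_update`, `lr` and `clip_value` are hypothetical names, and element-wise clipping is just one possible choice):

```python
import jax
import jax.numpy as jnp


def sgd_update(params, grads, lr=0.1, clip_value=1.0):
    """Plain SGD step with element-wise gradient clipping (illustrative only)."""
    # Clip every gradient entry to the range [-clip_value, clip_value].
    clipped = jax.tree_util.tree_map(
        lambda g: jnp.clip(g, -clip_value, clip_value), grads
    )
    # Standard SGD update applied leaf by leaf to the parameter pytree.
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, clipped)
```

This uses a fixed threshold; a distribution-based threshold would have to be computed from the gradients themselves before the clipping step.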
1 reply
-
Hi,
I'm looking for a way to clip gradients based on their distribution within a minibatch at each time step, lowering the norm of only the extreme ones. Basically, I want to cap the gradients that lie outside some percentile at a given value, i.e. winsorizing (https://en.wikipedia.org/wiki/Winsorizing). I'm surprised this doesn't exist in most frameworks. Is there a principled way to do this, or should I write it myself? The SciPy function https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html has not been implemented in JAX yet.
Thanks in advance,
Mathis
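
For concreteness, here is a minimal sketch of one possible reading of the request: compute per-example gradients with `jax.vmap`, take the L2 norm of each example's gradient, and rescale only the examples whose norm exceeds a chosen quantile of the minibatch. The names `loss_fn`, `winsorized_grads` and `upper_q` are assumptions made for illustration, the linear model is a placeholder, and this is not an existing JAX or Optax API.

```python
import jax
import jax.numpy as jnp


def loss_fn(params, x, y):
    # Hypothetical per-example squared-error loss for a linear model.
    pred = jnp.dot(x, params["w"]) + params["b"]
    return (pred - y) ** 2


def winsorized_grads(params, xs, ys, upper_q=0.9):
    """Average per-example gradients after capping their norms at a quantile."""
    # Per-example gradients: every leaf gains a leading batch axis.
    per_ex = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0, 0))(params, xs, ys)
    leaves = jax.tree_util.tree_leaves(per_ex)
    # Global L2 norm of each example's gradient, taken across all leaves.
    sq = sum(jnp.sum(g.reshape(g.shape[0], -1) ** 2, axis=1) for g in leaves)
    norms = jnp.sqrt(sq)
    # Cap is the chosen quantile of the norms observed in this minibatch.
    cap = jnp.quantile(norms, upper_q)
    # Examples at or below the cap keep scale 1; larger ones are shrunk to it.
    scale = jnp.minimum(1.0, cap / jnp.maximum(norms, 1e-12))

    def rescale_and_mean(g):
        # Broadcast the per-example scale over the remaining axes of the leaf.
        s = scale.reshape((-1,) + (1,) * (g.ndim - 1))
        return jnp.mean(g * s, axis=0)

    return jax.tree_util.tree_map(rescale_and_mean, per_ex), norms, cap
```

With `upper_q=1.0` nothing is rescaled and the result reduces to the ordinary minibatch-mean gradient. An element-wise variant (clipping each gradient coordinate to its own quantile across the batch) would be closer to `scipy.stats.mstats.winsorize`, at the cost of changing gradient directions.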