Hi, thanks for your great work!

I’m wondering whether the UPGrad aggregator can incorporate a gradient normalization step. Specifically, is it possible to rescale the task gradients to unit norm, so that they all have equal magnitude, before aggregation? If so, could you please suggest how to implement this in the current code framework?

Replies: 1 comment

Hi, this is definitely possible. You simply need to define your own subclass of the aggregator base class that normalizes each task gradient before aggregating. Let me know if this works. On a side note, I think this might be a bad idea (at least in theory), because gradient normalization will typically lead to weight divergence and thus extreme overfitting.
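To make that concrete, here is a minimal sketch of such a subclass. It assumes torchjd's aggregator interface, in which an aggregator maps a Jacobian matrix (one task gradient per row) to a single aggregated gradient vector via `forward(matrix)`; the import path, the `Aggregator` base class, and the `NormalizedUPGrad` name are assumptions to be checked against the actual codebase.

```python
import torch
from torchjd.aggregation import Aggregator, UPGrad  # assumed import path


class NormalizedUPGrad(Aggregator):
    """Rescales each row of the Jacobian (one task gradient per row) to
    unit L2 norm, then delegates the aggregation to UPGrad."""

    def __init__(self, eps: float = 1e-12):
        super().__init__()
        self.upgrad = UPGrad()
        self.eps = eps  # guards against division by zero for vanishing gradients

    def forward(self, matrix: torch.Tensor) -> torch.Tensor:
        # Row-wise L2 norms, kept as a column so broadcasting divides each row.
        norms = matrix.norm(dim=1, keepdim=True).clamp_min(self.eps)
        return self.upgrad(matrix / norms)
```

Under the same assumptions, an instance of this class should be usable anywhere an `UPGrad` instance is expected, e.g. passed as the aggregator to torchjd's backward call.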