Hi, thanks for your great work!

I’m wondering whether the UPGrad aggregator can incorporate a gradient normalization step. Specifically, is it possible to rescale the task gradients to unit norm, so that they all have equal magnitude, before aggregation? If so, could you please suggest how to implement this in the current code framework?

Replies: 1 comment

Hi, this is definitely possible. You simply need to define your own subclass of the aggregator base class that normalizes each task gradient before aggregating. Let me know if this works. On a side note, I think this might be a bad idea (at least in theory), because gradient normalization will typically lead to weight divergence and thus extreme overfitting.
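To make that concrete, here is a minimal sketch of such a subclass. It assumes torchjd's aggregator interface, in which an aggregator maps a Jacobian matrix (one task gradient per row) to a single aggregated gradient vector via `forward(matrix)`; the import path, the `Aggregator` base class, and the `NormalizedUPGrad` name are assumptions to be checked against the actual codebase.

```python
import torch
from torchjd.aggregation import Aggregator, UPGrad  # assumed import path


class NormalizedUPGrad(Aggregator):
    """Rescales each row of the Jacobian (one task gradient per row) to
    unit L2 norm, then delegates the aggregation to UPGrad."""

    def __init__(self, eps: float = 1e-12):
        super().__init__()
        self.upgrad = UPGrad()
        self.eps = eps  # guards against division by zero for vanishing gradients

    def forward(self, matrix: torch.Tensor) -> torch.Tensor:
        # Row-wise L2 norms, kept as a column so broadcasting divides each row.
        norms = matrix.norm(dim=1, keepdim=True).clamp_min(self.eps)
        return self.upgrad(matrix / norms)
```

Under the same assumptions, an instance of this class should be usable anywhere an `UPGrad` instance is expected, e.g. passed as the aggregator to torchjd's backward call.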