I am experimenting with the optimizers pattern to build an optimizer that tries its best to reduce the learning rate whenever some loss-increase condition happens. I think the best way to do this is to include the step_size in the optimizer state. That part is fine, but the step_size update condition depends on the previous loss and the current loss, which means one has to pass both the gradients and the loss into update, and the checks in the optimizer complain about this.
import numpy as np
from jax.example_libraries.optimizers import optimizer  # jax.experimental.optimizers in older JAX

@optimizer
def sgd_custom(step_size):
  """Construct optimizer triple for stochastic gradient descent, but with a dynamic step_size.

  Args:
    step_size: positive scalar, or a callable representing a step size schedule
      that maps the iteration index to a positive scalar.

  Returns:
    An (init_fun, update_fun, get_params) triple.
  """
  # step_size = make_schedule(step_size)

  def init(x0, loss0=None, step_size0=step_size):
    if loss0 is None:
      loss0 = np.inf
    return x0, loss0, step_size0

  def update(i, g_loss, state):
    # This breaks the API to some extent: the loss value now has to be
    # passed in together with the gradients.
    g, loss = g_loss
    x, prev_loss, step_size = state
    # Halve the step size whenever the loss grew by more than 10%.
    if loss / prev_loss > 1.1:
      step_size *= 0.5
    x = x - step_size * g
    return x, loss, step_size

  def get_params(state):
    x, _, _ = state
    return x

  return init, update, get_params
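For reference, here is a rough sketch of how I am driving it (loss_fn and the initial params are just placeholders); the first opt_update call is what triggers the error shown below, because the (gradients, loss) tuple no longer matches the pytree structure of the params:

# Rough sketch only; loss_fn and the initial params are placeholders.
import jax
import jax.numpy as jnp

def loss_fn(x):
  return jnp.sum(x ** 2)

opt_init, opt_update, get_params = sgd_custom(step_size=0.1)
opt_state = opt_init(jnp.ones(3))

for i in range(10):
  x = get_params(opt_state)
  loss, g = jax.value_and_grad(loss_fn)(x)
  # Passing (g, loss) instead of just g is what trips the pytree check.
  opt_state = opt_update(i, (g, loss), opt_state)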
If you try using this, you get an error like:
TypeError: optimizer update function was passed a gradient tree that did not match the parameter tree structure with which it was initialized: parameter tree PyTreeDef(*) and grad tree PyTreeDef((*, *)).
I am wondering: (a) is this a bad idea in general for other reasons, and (b) if not, is it worth thinking about allowing this pattern?
Or maybe this multi-step pattern exists elsewhere in the optimizer code and I have not yet come across it.
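In case it helps frame the question, the workaround I have at the moment is to drop the @optimizer decorator and write the triple by hand, so update is free to take the loss as a separate argument and the gradient/parameter tree check never runs. Rough sketch (sgd_custom_manual is just a throwaway name):

import numpy as np
import jax.numpy as jnp
from jax import tree_util

def sgd_custom_manual(step_size):
  # Same idea, but without the @optimizer decorator: update takes the loss
  # as an extra argument, and no gradient/parameter tree check is performed.
  def init(x0):
    return x0, np.inf, step_size

  def update(i, g, loss, state):
    x, prev_loss, lr = state
    # jnp.where keeps this jit-friendly; halve lr if the loss grew by >10%.
    lr = jnp.where(loss / prev_loss > 1.1, lr * 0.5, lr)
    new_x = tree_util.tree_map(lambda p, gp: p - lr * gp, x, g)
    return new_x, loss, lr

  def get_params(state):
    x, _, _ = state
    return x

  return init, update, get_params

It works, but it duplicates the pytree plumbing that the decorator normally handles, which is part of why I am asking whether a supported pattern exists.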