How to get gradient from the loss of two gradients? #7152

stolet · 2021-07-01T01:38:01Z

I have two sets of parameters: theta, from a model created using haiku, and w, a set of weights. When training the model, I compute gradient updates with respect to theta on two different datasets, yielding grad_f_s and grad_f_d. I then want to compute a loss based on these two gradients and get gradient updates for this loss with respect to w (the weights w are used when computing grad_f_s). The problem is that grad_w is all zeroes, even though grad_loss is not zero. How do I properly compute a gradient update from a loss that is itself computed from two other gradients?

grad_w = jax.grad(grad_loss, 0)(w, theta, grad_f_d, model, inp, target)

def grad_loss(w, theta, grad_f_d, model, inp, target):
    pred = jax.jit(model.apply)(theta, inp)
    softplus_w = jax.nn.softplus(w)
    grad_f_s = jax.grad(mse_loss_weighted, argnums=2)(pred, target, theta, w)

    dot_norms = jax.tree_util.tree_map(map_dot_norm, grad_f_d, grad_f_s)
    loss_sum = jax.tree_util.tree_reduce(lambda agg, x: agg + x,
                                         dot_norms,
                                         0)
    return loss_sum

def mse_loss_weighted(pred, target, theta, w):
    loss = jnp.power(pred - target, 2)
    loss_weighted = loss * w
    return jnp.sum(loss_weighted)
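
A likely cause, going only by the code as posted: `mse_loss_weighted` never uses `theta` (the forward pass producing `pred` runs outside it), so `jax.grad(..., argnums=2)` returns zeros for `grad_f_s`, and a loss built from that no longer depends on `w`. The following is a minimal sketch of one way to restructure it, not a confirmed fix. `apply_model` and `neg_dot` are hypothetical stand-ins for `model.apply` and `map_dot_norm`, which are not shown above; per-example weights `w` of shape `(batch,)` and applying softplus inside the loss are also assumptions.

```python
import jax
import jax.numpy as jnp


def apply_model(theta, inp):
    # Hypothetical stand-in for haiku's model.apply(theta, inp): one linear layer.
    return inp @ theta["w"] + theta["b"]


def mse_loss_weighted(theta, w, inp, target):
    # The forward pass runs inside the differentiated function,
    # so the loss really depends on theta.
    pred = apply_model(theta, inp)
    per_example = jnp.sum((pred - target) ** 2, axis=-1)
    # Assumption: one weight per example, made positive with softplus.
    return jnp.sum(per_example * jax.nn.softplus(w))


def neg_dot(a, b):
    # Hypothetical stand-in for map_dot_norm: negative inner product of two leaves.
    return -jnp.vdot(a, b)


def grad_loss(w, theta, grad_f_d, inp, target):
    # Gradient with respect to theta (argnums=0); the result is still a function of w.
    grad_f_s = jax.grad(mse_loss_weighted, argnums=0)(theta, w, inp, target)
    terms = jax.tree_util.tree_map(neg_dot, grad_f_d, grad_f_s)
    return jax.tree_util.tree_reduce(lambda agg, x: agg + x, terms, 0.0)


# Toy data, purely illustrative.
key_s, key_t, key_d = jax.random.split(jax.random.PRNGKey(0), 3)
theta = {"w": jnp.ones((3, 2)), "b": jnp.zeros((2,))}
inp_s = jax.random.normal(key_s, (4, 3))
target_s = jax.random.normal(key_t, (4, 2))
inp_d = jax.random.normal(key_d, (5, 3))
target_d = jnp.zeros((5, 2))
w = jnp.zeros((4,))

# Gradient on the second dataset, treated as a constant when differentiating in w.
grad_f_d = jax.grad(mse_loss_weighted, argnums=0)(
    theta, jnp.zeros((5,)), inp_d, target_d
)

# Second-order step: differentiate through grad_f_s with respect to w.
grad_w = jax.grad(grad_loss, argnums=0)(w, theta, grad_f_s := None or theta, inp_s, target_s) if False else \
    jax.grad(grad_loss, argnums=0)(w, theta, grad_f_d, inp_s, target_s)
```

In this sketch `grad_w` has one entry per example in the first dataset. Because `jax.grad` composes, the outer call differentiates through the inner one without any extra machinery, provided the inner loss actually uses both `theta` and `w`.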

Replies: 0 comments