Forward over forward mode HVPs #7669
Replies: 1 comment 1 reply
-
Thanks for the question! Actually, there's a bug in the code. Check the dimension of these results:

print(jnp.ndim(correct_answer))             # 2
print(jnp.ndim(hvp_fwdfwd(f, (X,), (V,))))  # 0

A hint is that in the implementation above you're using:

def hvp_fwdfwd(f, primals, tangents):
    g = lambda primals: jvp(f, (primals,), tangents)[1]
    return jvp(g, primals, tangents)[1]

Another clue is dimensionality: each application of `jvp` produces an output with the same shape as the function's output, so two nested `jvp`s of a scalar-valued function can only produce a scalar — never an array shaped like the input. In general I don't think we can get an HVP using just two applications of `jvp`; the outer forward pass has to build a full Jacobian with `jacfwd`, as below:

from jax import jvp, grad, hessian, jacfwd
from jax import random
import jax.numpy as jnp
# Fixed PRNG seed so the example is reproducible.
key = random.PRNGKey(0)
def f(X):
    """Scalar test objective: sum over all entries of tanh(X)**2."""
    squared_tanh = jnp.tanh(X) ** 2
    return squared_tanh.sum()
# Derive independent subkeys so X and V get different random draws.
key, subkey1, subkey2 = random.split(key, 3)
X = random.normal(subkey1, (30, 40))  # primal point
V = random.normal(subkey2, (30, 40))  # tangent (direction) vector
# Reference HVP: contract the full Hessian of f at X with V over its last two axes.
correct_answer = jnp.tensordot(hessian(f)(X), V, 2)
def hvp_fwdrev(f, primals, tangents):
    """Hessian-vector product via forward-over-reverse: jvp of grad(f)."""
    _, tangent_out = jvp(grad(f), primals, tangents)
    return tangent_out
def hvp_revfwd(f, primals, tangents):
    """Hessian-vector product via reverse-over-forward.

    Note: grad is taken w.r.t. the whole primal tuple, so the result is a
    tuple matching `primals` (a 1-tuple for a single-argument f).
    """
    def directional_derivative(primal_tuple):
        # Scalar <grad f, v> evaluated at the given primal tuple.
        return jvp(f, primal_tuple, tangents)[1]
    return grad(directional_derivative)(primals)
def hvp_revrev(f, primals, tangents):
    """Hessian-vector product via reverse-over-reverse."""
    (point,) = primals
    (direction,) = tangents

    def grad_dot_v(p):
        # Scalar <grad f(p), v>; differentiating it again gives H @ v.
        return jnp.vdot(grad(f)(p), direction)

    return grad(grad_dot_v)(point)
def hvp_fwdfwd(f, primals, tangents):
    """Hessian-vector product via forward-over-forward.

    The outer pass must be a full Jacobian (jacfwd) of the scalar
    directional derivative -- two plain jvp's would only yield a scalar.
    """
    def directional_derivative(*args):
        # Scalar v . grad f evaluated at `args`.
        return jvp(f, args, tangents)[1]
    return jacfwd(directional_derivative)(*primals)
# Validate each HVP variant against the dense-Hessian reference value.
for label, hvp in [
    ("Forward over reverse, correct", hvp_fwdrev),
    ("Reverse over forward, correct", hvp_revfwd),
    ("Reverse over reverse, correct", hvp_revrev),
    ("Forward over forward, correct", hvp_fwdfwd),
]:
    print(label, jnp.allclose(correct_answer, hvp(f, (X,), (V,)), 1e-4, 1e-4))
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Following the Autodiff cookbook, there are three variants for computing Hessian-vector products (HVPs): a) forward-over-reverse, b) reverse-over-forward and c) reverse-over-reverse. Clearly, a fourth option is missing: d) forward-over-forward. I tried to implement this in the same style; however, it does not work. Is this a conceptual problem or is there a mistake in the code?
Here is a minimal example (jax.version = 0.2.12):
# Imports for the minimal example (jax.__version__ == 0.2.12 per the post).
from jax import jvp, grad, hessian
from jax import random
import jax.numpy as jnp  # FIX: jnp is used below (f, correct_answer) but was missing from the snippet

# Fixed PRNG seed so the example is reproducible.
key = random.PRNGKey(0)
def f(X):
    """Sum of elementwise tanh(X)**2 -- a scalar function of the array X."""
    return (jnp.tanh(X) ** 2).sum()
# Split into independent subkeys so X and V use different randomness.
key, subkey1, subkey2 = random.split(key, 3)
X = random.normal(subkey1, (30, 40))  # primal point
V = random.normal(subkey2, (30, 40))  # tangent (direction) vector
# Reference HVP: contract the full Hessian of f at X with V over two axes.
correct_answer = jnp.tensordot(hessian(f)(X), V, 2)
def hvp_fwdrev(f, primals, tangents):
    """Forward-over-reverse HVP: push tangents through grad(f) with jvp."""
    primal_out, tangent_out = jvp(grad(f), primals, tangents)
    return tangent_out
def hvp_revfwd(f, primals, tangents):
    """Reverse-over-forward HVP.

    grad is taken w.r.t. the primal tuple itself, so the return value is a
    tuple with the same structure as `primals`.
    """
    def tangent_of_f(p):
        return jvp(f, p, tangents)[1]
    return grad(tangent_of_f)(primals)
def hvp_revrev(f, primals, tangents):
    """Reverse-over-reverse HVP: grad of the scalar <grad f, v>."""
    (point,) = primals
    (direction,) = tangents
    inner = lambda p: jnp.vdot(grad(f)(p), direction)
    return grad(inner)(point)
def hvp_fwdfwd(f, primals, tangents):
    # NOTE(review): this is the broken attempt the discussion is about.
    # jvp returns an output shaped like f's output, so nesting jvp twice on
    # a scalar-valued f yields a scalar (ndim 0) rather than an HVP shaped
    # like X (ndim 2) -- hence "Forward over forward, correct False" below.
    g = lambda primals: jvp(f, (primals,), tangents)[1]
    return jvp(g, primals, tangents)[1]
# Compare every HVP implementation against the dense-Hessian reference.
for label, hvp in [
    ("Forward over reverse, correct", hvp_fwdrev),
    ("Reverse over forward, correct", hvp_revfwd),
    ("Reverse over reverse, correct", hvp_revrev),
    ("Forward over forward, correct", hvp_fwdfwd),
]:
    print(label, jnp.allclose(correct_answer, hvp(f, (X,), (V,)), 1e-4, 1e-4))
Output:
Forward over reverse, correct True
Reverse over forward, correct True
Reverse over reverse, correct True
Forward over forward, correct False
Beta Was this translation helpful? Give feedback.
All reactions