Why does custom pytree node produce node-in-node for `jax.jacfwd()`? #12656

agoose77 · 2022-10-04T21:08:44Z

Given this minimal pytree reproducer:

import jax
import numpy as np

jax.config.update("jax_platform_name", "cpu")


class SpecialArray:
    def __init__(self, array):
        self._array = array

    def __mul__(self, other):
        assert isinstance(other, (int, float))
        return SpecialArray(self._array * other)

    def __sub__(self, other):
        assert isinstance(other, (int, float))
        return SpecialArray(self._array - other)

    def __len__(self):
        return len(self._array)


def flatten_special(special):
    assert not isinstance(special._array, SpecialArray)
    return (special._array,), None


def unflatten_special(aux_data, leaves):
    return SpecialArray(leaves[0])


jax.tree_util.register_pytree_node(
    SpecialArray,
    flatten_special,
    unflatten_special,
)


def func(x):
    return x * 2 - 1


array = np.array([[0, 1, 2], [3, 4, 5.0]])
jac = jax.jacfwd(func)(array)

special = SpecialArray(array)
jac_special = jax.jacfwd(func)(special)
print(jac_special)

I observe that jac_special has the form

jac_special = SpecialArray(
    SpecialArray(
        <JACOBIAN>
    )
)

Why is JAX passing my own node type as a leaf to unflatten_special?

jakevdp · 2022-10-04T22:35:04Z

jakevdp
Oct 4, 2022
Maintainer

The jacobian, as an operation, computes the derivative of each output element with respect to the input. So, for example, if you have a function that accepts an array of length n and returns an array of length m, you get m derivatives, each of length n. More concretely, a function that maps 2 inputs to 3 outputs will have a jacobian of shape (3, 2):

import jax
import jax.numpy as jnp

def f(x):  # input vector of length n
  return jnp.append(x, x.mean())  # output vector of length m = n + 1

x = jnp.array([1., 2.])  # n = 2
jax.jacrev(f)(x)  # output has shape (m, n) = (3, 2)
# DeviceArray([[1. , 0. ],
#              [0. , 1. ],
#              [0.5, 0.5]], dtype=float32)

You can think of this as a length-3 array in which each element is a length-2 array giving the gradient of a single output element with respect to the full input. How does this relate to pytrees? If your function takes a pytree as input and returns a tuple, the jacobian will be a tuple containing one pytree per output value:

from typing import NamedTuple

class MyType(NamedTuple):
  x: jnp.ndarray
  y: jnp.ndarray

def f(a):
  return (a.x + a.y, a.x - a.y)

a = MyType(jnp.float32(1), jnp.float32(2))
jax.jacrev(f)(a)
# (MyType(x=DeviceArray(1., dtype=float32), y=DeviceArray(1., dtype=float32)),
#  MyType(x=DeviceArray(1., dtype=float32), y=DeviceArray(-1., dtype=float32)))

But what if, instead of returning a tuple, you return another pytree? In that case you still get one pytree per output, but those pytrees are embedded in the output pytree rather than in a tuple:

def f(a):
  return MyType(a.x + a.y, a.x - a.y)

a = MyType(jnp.float32(1), jnp.float32(2))
jax.jacrev(f)(a)
# MyType(x=MyType(x=DeviceArray(1., dtype=float32), y=DeviceArray(1., dtype=float32)),
#        y=MyType(x=DeviceArray(1., dtype=float32), y=DeviceArray(-1., dtype=float32)))
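
To make the nesting concrete, here is a small sketch of how the result above might be indexed: the outer field selects the output and the inner field selects the input it is differentiated against. The names `jac`, `d_outx_d_iny` and `d_outy_d_iny` are illustrative only and not part of the original example.

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp


class MyType(NamedTuple):
  x: jnp.ndarray
  y: jnp.ndarray


def f(a):
  return MyType(a.x + a.y, a.x - a.y)


a = MyType(jnp.float32(1), jnp.float32(2))
jac = jax.jacrev(f)(a)

# Outer attribute: which output. Inner attribute: which input.
d_outx_d_iny = jac.x.y  # derivative of (a.x + a.y) with respect to a.y
d_outy_d_iny = jac.y.y  # derivative of (a.x - a.y) with respect to a.y
print(d_outx_d_iny, d_outy_d_iny)
```

Read this way, `jac.y.y` picks out the -1 entry in the printed result above.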

The jacobian computes the gradient of each output value with respect to the full input. If you think about it this way, it's clear that when computing the jacobian of a function which maps a pytree to a pytree, a nested pytree is the most logical representation.
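
Applied to the SpecialArray from the question, the same reasoning suggests the outer node mirrors the output pytree and the inner node mirrors the input pytree. The following is only a sketch under that reading: it uses a trimmed-down SpecialArray (assertions and `__len__` dropped, a `jnp` array as input), and reaching into `_array` directly is purely for illustration.

```python
import jax
import jax.numpy as jnp


class SpecialArray:
    """Trimmed-down version of the class from the question."""

    def __init__(self, array):
        self._array = array

    def __mul__(self, other):
        return SpecialArray(self._array * other)

    def __sub__(self, other):
        return SpecialArray(self._array - other)


jax.tree_util.register_pytree_node(
    SpecialArray,
    lambda special: ((special._array,), None),         # flatten
    lambda aux_data, leaves: SpecialArray(leaves[0]),  # unflatten
)


def func(x):
    return x * 2 - 1


array = jnp.array([[0, 1, 2], [3, 4, 5.0]])
jac_plain = jax.jacfwd(func)(array)
jac_special = jax.jacfwd(func)(SpecialArray(array))

outer = jac_special      # has the structure of the output pytree
inner = outer._array     # has the structure of the input pytree
jacobian = inner._array  # the raw array: output shape followed by input shape
print(jacobian.shape)
```

With a single leaf in and a single leaf out, the innermost array should match what `jax.jacfwd` returns for the plain array, so unwrapping both levels is one way to recover it.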

Does that make sense?

agoose77 Oct 4, 2022
Author

Thanks @jakevdp! Had I gone to the lengths of including a "what I expect to happen" section, I would likely have realised this would be the case. I suspect I was tunneled on the problem rather than its context.

This has given me some new ideas in tackling the problem for which this example was a model. Thanks, as ever!
