Disregard custom pytree nodes in tree_util functions? #20729
-
I have a custom pytree node with a static auxiliary field. Now I'm running into the issue that I can't get a tree structure of the class anymore without baking the static argument into it. This leads to wrong results when unflattening one object with the tree_def of another. (I need to flatten/unflatten because I'm feeding the object into a TensorFlow data-loading function that only allows a tuple of arrays as output. Because of that restriction I can't pass the actual object through.)

Example:

```python
import jax
from typing import NamedTuple

# Class def
class MyNode(NamedTuple):
    value: jax.numpy.ndarray
    static_size: int

def mynode_flatten(node):
    return [node.value], [node.static_size]

def mynode_unflatten(aux, xs):
    return MyNode(xs[0], aux[0])

jax.tree_util.register_pytree_node(MyNode, mynode_flatten, mynode_unflatten)

# Store MyNode structure for unflattening later
inputs = MyNode(jax.numpy.zeros((1,)), 1)
global_def = jax.tree_util.tree_structure(inputs)
print(global_def)

tf_func = lambda x: x  # TensorFlow function that may only return a tuple of tensors

def tf_pipeline(inp):
    res = tf_func(lambda x: jax.tree_util.tree_flatten(x)[0])(inp)
    out = jax.tree_util.tree_unflatten(global_def, res)
    return out

res = tf_pipeline(inputs)
assert res == inputs  # ok

new_inputs = MyNode(jax.numpy.zeros((2,)), 2)
new_res = tf_pipeline(new_inputs)  # <- uses static_size from global_def
assert new_res == new_inputs, new_res  # not ok
```

Output:

```
AssertionError: MyNode(value=Array([0., 0.], dtype=float32), static_size=1)
```

What I would like is a way to flatten/unflatten without the static argument being baked into the treedef.
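The root cause can be checked directly (a runnable sketch, assuming `jax` is installed): the aux data returned by the flatten rule becomes part of the treedef itself, so two instances that differ only in `static_size` have different structures, and one treedef cannot faithfully unflatten the other's leaves.

```python
import jax
from typing import NamedTuple

class MyNode(NamedTuple):
    value: jax.numpy.ndarray
    static_size: int

def mynode_flatten(node):
    return [node.value], [node.static_size]

def mynode_unflatten(aux, xs):
    return MyNode(xs[0], aux[0])

jax.tree_util.register_pytree_node(MyNode, mynode_flatten, mynode_unflatten)

# The aux data ends up inside the treedef, not in the leaves:
def1 = jax.tree_util.tree_structure(MyNode(jax.numpy.zeros((1,)), 1))
def2 = jax.tree_util.tree_structure(MyNode(jax.numpy.zeros((2,)), 2))
assert def1 != def2  # structures differ because static_size differs
```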
-
Why not use a local treedef rather than a global treedef?

```python
def tf_pipeline(inp):
    treedef = None
    def func(x):
        nonlocal treedef
        x_flat, treedef = jax.tree_util.tree_flatten(x)
        return x_flat
    res = tf_func(func)(inp)
    out = jax.tree_util.tree_unflatten(treedef, res)
    return out
```

When I use this version of …
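For reference, a self-contained check of this local-treedef pattern (with `tf_func` stubbed as an identity wrapper, as in the question's example, and the same `MyNode` registration) shows it round-trips both inputs, since each call unflattens with the treedef captured from its own input:

```python
import jax
from typing import NamedTuple

class MyNode(NamedTuple):
    value: jax.numpy.ndarray
    static_size: int

jax.tree_util.register_pytree_node(
    MyNode,
    lambda n: ([n.value], [n.static_size]),
    lambda aux, xs: MyNode(xs[0], aux[0]),
)

tf_func = lambda f: f  # stand-in for the TensorFlow function

def tf_pipeline(inp):
    treedef = None
    def func(x):
        nonlocal treedef
        x_flat, treedef = jax.tree_util.tree_flatten(x)
        return x_flat
    res = tf_func(func)(inp)
    return jax.tree_util.tree_unflatten(treedef, res)

a = MyNode(jax.numpy.zeros((1,)), 1)
b = MyNode(jax.numpy.zeros((2,)), 2)
out_b = tf_pipeline(b)
assert out_b.static_size == 2   # taken from b's own treedef, not a stale global one
assert out_b.value is b.value   # the leaf array round-trips unchanged
```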
You can do this using the …
-
You could do it in a bit of a workaround, like:

```python
import jax
import jax.numpy as jnp
from typing import NamedTuple

# Class def
class MyNode(NamedTuple):
    value: jnp.ndarray
    static_size: int

def mynode_flatten(node):
    return [node.value], [node.static_size]

def mynode_unflatten(aux, xs):
    return MyNode(xs[0], aux[0])

jax.tree_util.register_pytree_node(MyNode, mynode_flatten, mynode_unflatten)

# Wrapper function
def tf_pipeline_wrapper(inp):
    flattened_input = jax.tree_util.tree_flatten(inp)[0]
    tf_output = tf_func(flattened_input)  # Assuming tf_func is defined elsewhere
    return jax.tree_util.tree_unflatten(global_def, tf_output)

# Usage
inputs = MyNode(jax.numpy.zeros((1,)), 1)
global_def = jax.tree_util.tree_structure(inputs)
print(global_def)

res = tf_pipeline_wrapper(inputs)
assert res == inputs  # ok

new_inputs = MyNode(jax.numpy.zeros((2,)), 2)
new_res = tf_pipeline_wrapper(new_inputs)
assert new_res == new_inputs  # ok
```

This wrapper function `tf_pipeline_wrapper` takes an input of type `MyNode`, flattens it, passes the flattened structure to TensorFlow's function (`tf_func`), and then unflattens the result using the global tree structure. This way, it bypasses the static size issue you have.
Oh I see, I missed that you were overriding the default flattening rule. To answer your question: no, I don’t think there’s any way to have context-dependent flattening rules like what you have in mind.
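One detail worth noting (a fact about JAX's defaults, not something stated in the thread): a `NamedTuple` that is *not* registered with a custom rule is already handled by JAX's built-in namedtuple flattening, which treats every field as a child. The static field then travels as a leaf rather than being baked into the treedef, so the structure no longer depends on its value:

```python
import jax
from typing import NamedTuple

# NOT registered with a custom rule: JAX's default namedtuple handling applies
class PlainNode(NamedTuple):
    value: jax.numpy.ndarray
    static_size: int

leaves, treedef = jax.tree_util.tree_flatten(PlainNode(jax.numpy.zeros((2,)), 2))
assert len(leaves) == 2  # both fields are leaves, including the int

other_def = jax.tree_util.tree_structure(PlainNode(jax.numpy.zeros((1,)), 1))
assert treedef == other_def  # structure is independent of the field values
```

The trade-off is that `static_size` would then have to pass through the TensorFlow pipeline as a tensor, which may or may not be acceptable in your setup.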