refactor(module)!: A proposal to reorg modules and functions with new tree module and treespec module#186
refactor(module)!: A proposal to reorg modules and functions with new tree module and treespec module#186lqhuang wants to merge 13 commits intometaopt:mainfrom
tree module and treespec module#186Conversation
|
@XuehaiPan Hi, could you take a look on this proposal. What's your opinion? I only improve add a new Any feedback is welcome. It's also fine to reject! Never mind, I understand that. Thanks and appreciate your time! |
9d42e94 to
61e5e0d
Compare
tree module and treespec moduletree module and treespec module
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #186 +/- ##
==========================================
Coverage 100.00% 100.00%
==========================================
Files 36 39 +3
Lines 3786 4038 +252
==========================================
+ Hits 3786 4038 +252
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
|
@XuehaiPan I checked codebase of jax to find some inspirations and references Here is how jax refactors # https://github.com/jax-ml/jax/blob/0fb278a0b9152f341df69662c573d14f991fd51a/jax/_src/tree_util.py#L350
@export
def tree_map(f: Callable[..., Any],
tree: Any,
*rest: Any,
is_leaf: Callable[[Any], bool] | None = None) -> Any:
"""Alias of :func:`jax.tree.map`."""
leaves, treedef = tree_flatten(tree, is_leaf)
all_leaves = [leaves] + [treedef.flatten_up_to(r) for r in rest]
return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))What I learned:
# https://github.com/jax-ml/jax/blob/0fb278a0b9152f341df69662c573d14f991fd51a/jax/_src/tree.py#L115C1-L156C1
def map(f: Callable[..., Any],
tree: Any,
*rest: Any,
is_leaf: Callable[[Any], bool] | None = None) -> Any:
"""Maps a multi-input function over pytree args to produce a new pytree.
Args:
f: function that takes ``1 + len(rest)`` arguments, to be applied at the
corresponding leaves of the pytrees.
tree: a pytree to be mapped over, with each leaf providing the first
positional argument to ``f``.
rest: a tuple of pytrees, each of which has the same structure as ``tree``
or has ``tree`` as a prefix.
is_leaf: an optionally specified function that will be called at each
flattening step. It should return a boolean, which indicates whether the
flattening should traverse the current object, or if it should be stopped
immediately, with the whole subtree being treated as a leaf.
Returns:
A new pytree with the same structure as ``tree`` but with the value at each
leaf given by ``f(x, *xs)`` where ``x`` is the value at the corresponding
leaf in ``tree`` and ``xs`` is the tuple of values at corresponding nodes in
``rest``.
Examples:
>>> import jax
>>> jax.tree.map(lambda x: x + 1, {"x": 7, "y": 42})
{'x': 8, 'y': 43}
If multiple inputs are passed, the structure of the tree is taken from the
first input; subsequent inputs need only have ``tree`` as a prefix:
>>> jax.tree.map(lambda x, y: [x] + y, [5, 6], [[7, 9], [1, 2]])
[[5, 7, 9], [6, 1, 2]]
See Also:
- :func:`jax.tree.leaves`
- :func:`jax.tree.reduce`
"""
return tree_util.tree_map(f, tree, *rest, is_leaf=is_leaf)What I learned:
FYI |
@lqhuang I think we are different than The rationale for having a separate jax.tree_util.tree_map(...)This is not as simple as: jax.tree.map(...)It trims However, for |
|
Thanks for your feedback! 😄
You're right. That's why I put the motivation
as the last one. It's not that important. On the other hand, The most important experiences I want to share here is I'm always trying to input first Second, if what I want is Don't worry, we could discuss more, and don't be in a hurry to draw a conclusion. After we have more clear scenarios and consensus, we would find proper solutions we both glad with, then you can decide the final approach, at least achieving a minimal goal (like only adding two module |
|
I hope to supplement more context. In general, my personal engineering practice is that we should embrace simplicity (not simple) and API should be intuitive. And I must point out a conflict of interest or the motivation only required by myself is my ongoing project heavily relies on operations of As same library developer and maintainer, I definitely understand the necessity of back forward compatibility and the pain of API evolution. And that is also where the community (like me) could help. When I'm a new lib user of Searching documentation obviously is a good habit. But what if we could do it better by using a well-structured namespace to avoid mistakes or cognitive burden w/o documentation. I expect that
Just like the abstraction of I may want more about your thoughts your opinions, including and not limited, for more concrete solutions or specific roadmap. |
|
I prefer to add a shortcut module for from optree import pytree
max_value = pytree.max({'a': 1, 'b': (2, 3), 'c': {'d': [4, 5], 'e': 6}})For functions prefixed with |
254bc32 to
0c5a33d
Compare
CI auto committed the code from I intentionally used instead of Because there exists My local ruff lint works fine $ make ruff
/Users/lqhuang/Git/optree/.venv/bin/python3 -m pip show ruff &>/dev/null || (cd && /Users/lqhuang/Git/optree/.venv/bin/python3 -m pip install --upgrade ruff)
/Users/lqhuang/Git/optree/.venv/bin/python3 -m ruff --version
ruff 0.9.4
/Users/lqhuang/Git/optree/.venv/bin/python3 -m ruff check .
All checks passed!But CI lint say it's an error? What's the different in config of CI env? |
It is modified by |
There was a problem hiding this comment.
Please revert the changes in this file. The optree.tree_* functions are still the standard APIs. We should test on these.
There was a problem hiding this comment.
Do you have any suggestion on how to reduce boilerplate tests?
I thought it. But after I make a copy of test_ops.py for optree.pytree, I find it would be noisy.
So I decided to test the wrapper directly.
I know it's still the standard, but in this way, we could test both two? If any change happens on optree.tree_*, from failed tests, we could also know we forget to update optree.pytree module.
There was a problem hiding this comment.
You could add a new file test_shortcut.py to test optree.pytree re-exports all tree_* functions.
def test_pytree_reexports():
assert set(optree.pytree.__all__) == {
name[len('tree_'):] for name in optree.__all__ if name.startswith('tree_')
}
for name in optree.pytree.__all__:
assert getattr(optree.pytree, name) is getattr(optree, f'tree_{name}')| none_is_leaf: bool = False, | ||
| namespace: str = '', | ||
| ) -> tuple[list[T], PyTreeSpec]: | ||
| """Flatten a pytree. |
There was a problem hiding this comment.
I still do not recommend redefining APIs with duplicate type annotations and docstrings. Just import and re-export the functions with a different name is enough. As for the function names in the docstrings (e.g., tree_flatten vs. flatten). It isn't a big problem.
Re-exporting the functions will also not need to handle the annotation problems of Tuple vs. tuple.
There was a problem hiding this comment.
In file optree/pytree.py, there is no annotation problem between Tuple and tuple.
I'm already using the approach that just re-exporting, so your concrete suggest is striping all type hints and docs?
There was a problem hiding this comment.
I'm already using the approach that just re-exporting, so your concrete suggest is striping all type hints and docs?
My suggestion is:
from optree.ops import (
tree_api1 as api1,
tree_api2 as api2,
tree_api3 as api3,
)
__all__ = [
'api1',
'api2',
'api3',
]There was a problem hiding this comment.
But the docs would be the same 🤔
There was a problem hiding this comment.
And also no .. versionadded:: 0.14.1 flag
There was a problem hiding this comment.
I'd prefer not to change the docstrings. #186 (comment)
There was a problem hiding this comment.
Sphinx can handle aliases. Readers can redirect to the docs of the original APIs.
https://optree.readthedocs.io/en/latest/typing.html#optree.PyTreeDef
There was a problem hiding this comment.
I'd prefer not to change the docstrings.
That only come into existence while you wish to never deprecate old APIs.
Of course, I understand every developer is trying to keep their own crafted codes 👍.
There was a problem hiding this comment.
Sphinx can handle aliases. Readers can redirect to the docs of the original APIs.
https://optree.readthedocs.io/en/latest/typing.html#optree.PyTreeDef
It's probably only useful when you mark tree_map (real implement) is an alias for pytree.map (with full docs).
There was a problem hiding this comment.
That only come into existence while you wish to never deprecate old APIs.
It is not only about backward compatibility but also makes the codebase easier to inspect. For example, you can easily find the implementation of tree_map by searching /tree_map/, while you will get many irrelevant occurrences when searching map.
08dcbca to
b351e73
Compare
I tried this, it's a syntax error. 🥹 |
You may need to add |
I fixed it already by other ways. It's not the |
cd85a56 to
ec74e45
Compare
|
Definitely you're the BOSS and owner 😂 I respect to your flavor style. I have opened a new PR with the simplest way. Please choose them as your option and review. Once accept one then close the another one. Thanks! |
|
Closing in favor of #189. |

Description
Detail changes have been added into implementated tasks.
Motivation and Context
I'm here to proposal a breaking change that reorg the functions like
tree_*andtreesepc_*into new moduletreeandtreespecPros:
optree.tree_*->tree.*tree/treespecjaxdid it either ...Cons:
treeandtreespecare quite common variable names, it increases the collision of naming.tree_map,tree_max...) will be conflict with the Python global namespace. We need to refactor carefully to avoid unexpected behaviors.Open questions:
Types of changes
What types of changes does your code introduce? Put an
xin all the boxes that apply:Implemented Tasks and Details
treemodule, and update doc string of functions1. To avoid conflicating with global builtins (
max,all, ...), I importedbuiltinsmodule and rewrote those functions tobuiltins.map(),builtins.max()explicitly.2. Keep poential conflict functions with
_treeprefix like_tree_map,_tree_leavesand redfine them at the end of module6. Check the last few line of
optree/tree.pyfor full list of export functions via redfinition way.tree_*functions inopsmoduletreemoduletreespecmodule, and update doc string of functionstreespec_functions inopsmoduletreespecmoduleChecklist
Go over all the following points, and put an
xin all the boxes that apply.If you are unsure about any of these, don't hesitate to ask. We are here to help!
make format. (required)make lint. (required)make testpass. (required)Appendix
Post local test and lint results
My local env
OS: macOS (Apple Silicon)
Python: 3.12.7
CMake: 3.31.3
clang: 16.0.0
Lint results
All doc test under
/optree/optree/tree.pyis passedAll doc test under
/optree/optree/tree.pyis passedTesting results
Updates after first draft:
I agree that
I still create a module named as
treespec, but it only has constructor functions liketreespec.list,treespec.nametuple, ..., etc for creating an instance ofTreeSpec.Reimplment
pytree(previoustreemodule) as a thin wrapper layer ofoptree.tree_*with updated documentations.How to test: I replaced all
optree.tree_intopytree.intest_ops.py, assume that the wrapper works then the implementation must work too. Luckily I didn't find any varable named aspytreeintest_ops.py. Besides, I havn't find any test for functions likeoptree.treesepc_list, if I missed them mistakenly, please help me to locate them.Major lints and tests passed in my local env
Added documentations and CHANGELOG
Rebased from main
HEAD.pytree.module as wrappers forops.tree_*.pytreemoduletreespecmodule, and update doc string of functionsPR Checklist
make format. (required)make lint. (required)make testpass. (required)Examples of new doc page
Merge of current PR will close #189