Skip to content

Support tuple variable names in _subset_list#148

Open
Chirag3841 wants to merge 4 commits intoarviz-devs:mainfrom
Chirag3841:var
Open

Support tuple variable names in _subset_list#148
Chirag3841 wants to merge 4 commits intoarviz-devs:mainfrom
Chirag3841:var

Conversation

@Chirag3841
Copy link

Check var_names behaviour and define what should be its type hint #83

xarray supports any hashable type as a variable or dimension name, including tuples such as ("tuple", "name"). This PR updates _subset_list to handle tuple names correctly, avoiding the current behavior where tuple inputs may be interpreted as multiple names.

Tuple inputs are treated as a single item when they exist in whole_list, otherwise they are treated as a container of names. Additionally, string-based filtering (filter_items="like" / "regex") is now applied only to string patterns and items to prevent type errors when non-string hashables are present. The membership validation was also updated to avoid NumPy failures with mixed hashable types.

Tests have been added to cover tuple variable name selection and filtering behavior.

@read-the-docs-community
Copy link

read-the-docs-community bot commented Feb 11, 2026

Comment on lines 73 to 77
subset: Hashable | Sequence[Hashable] | None,
whole_list: Sequence[Hashable],
filter_items: str | None = None,
warn: bool = True,
check_if_present: bool = True,
Copy link
Member

@OriolAbril OriolAbril Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only use explicit type hints when we want to have type hints that differ from the docstring, otherwise, to keep a single source of truth we stick to having the info on the docstring only. (note that docstub automatically translates that to proper type hints to add to the respective .pyi files)

if subset is not None:
if isinstance(subset, str):
subset = [subset]
elif isinstance(subset, tuple) and subset in whole_list:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used tuple in the issue as an example, but the whole point of the issue was to do a deeper investigation into the different potentially valid cases and how we want them to behave. If you restrict to tuple then we aren't really matching xarray's behaviour, see for example:

v = frozenset({"a", "b"})
ds = xr.Dataset({frozenset({"a", "b"}): (("dim",), [1, 2, 3])})
ds[v]
# out
# <xarray.DataArray frozenset({'b', 'a'}) (dim: 3)> Size: 24B
# array([1, 2, 3])
# Dimensions without coordinates: dim

Comment on lines +104 to +105
elif isinstance(subset, Sequence) and not isinstance(subset, str | bytes):
subset = list(subset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what this is doing

Comment on lines 126 to 130
real_items = [
real_item
for real_item in whole_list
if isinstance(real_item, str) and pattern in real_item
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is what we want. IIUC, with this behaviour, if I use var_names="~theta", filter_vars="like" and I have as variable names ("theta", "original"), ("theta", "transformed"), and ("tau", "original") I end up plotting/keeping all the variables.

I think for like it would make more sense to exclude the first two. For regex I am much less sure if we want to try and do something complicated or keep things simple and ignore filter_vars completely in case of non-string elements.

Important note: This is a collaborative project and it is quite probably it will take a while until we all agree on a behaviour around this. I may have ideas, but me saying "I think this or that should happen" doesn't automatically mean this should be the behaviour of the library. It can be frustrating but you'll probably need some extra patience for this PR.

Copy link
Author

@Chirag3841 Chirag3841 Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification. I agree it’s better to align with xarray behavior and get consensus before finalizing anything. I’m happy to iterate based on feedback and adjust the implementation/tests as needed. Please let me know what target behavior you’d prefer and I can update the PR accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants