Support tuple variable names in _subset_list by Chirag3841 · Pull Request #148 · arviz-devs/arviz-base

Chirag3841 · 2026-02-11T10:37:00Z

Check var_names behaviour and define what should be its type hint #83

xarray supports any hashable type as a variable or dimension name, including tuples such as ("tuple", "name"). This PR updates _subset_list to handle tuple names correctly, avoiding the current behavior where tuple inputs may be interpreted as multiple names.

Tuple inputs are treated as a single item when they exist in whole_list, otherwise they are treated as a container of names. Additionally, string-based filtering (filter_items="like" / "regex") is now applied only to string patterns and items to prevent type errors when non-string hashables are present. The membership validation was also updated to avoid NumPy failures with mixed hashable types.
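The tuple rule described above can be sketched roughly as follows. This is a minimal illustration and not the code in the PR: `normalize_subset` is a hypothetical helper name, and only the "single name if present in `whole_list`, otherwise a container of names" rule is shown.

```python
def normalize_subset(subset, whole_list):
    """Return ``subset`` as a list of names (hypothetical helper, for illustration)."""
    if subset is None:
        return None
    if isinstance(subset, str):
        return [subset]
    # A tuple that is itself a known name is kept as a single item...
    if isinstance(subset, tuple) and subset in whole_list:
        return [subset]
    # ...otherwise it is treated as a container of names.
    return list(subset)


names = ["mu", ("tuple", "name"), "tau"]
print(normalize_subset(("tuple", "name"), names))  # [('tuple', 'name')]
print(normalize_subset(("mu", "tau"), names))      # ['mu', 'tau']
```

Note that a tuple such as `("mu", "tau")` is ambiguous by construction: it is read as two names only because no variable with that exact tuple name exists.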

Tests have been added to cover tuple variable name selection and filtering behavior.

read-the-docs-community · 2026-02-11T10:38:41Z

Documentation build overview

📚 arviz-base | 🛠️ Build #31430970 | 📁 Comparing e4acbea against latest (ca9a28f)

🔍 Preview build

Show files changed (18 files in total): 📝 18 modified | ➕ 0 added | ➖ 0 deleted

File	Status
api/index.html	📝 modified
tutorial/WorkingWithDataTree.html	📝 modified
api/generated/arviz_base.convert_to_dataset.html	📝 modified
api/generated/arviz_base.convert_to_datatree.html	📝 modified
api/generated/arviz_base.dataset_to_dataarray.html	📝 modified
api/generated/arviz_base.dataset_to_dataframe.html	📝 modified
api/generated/arviz_base.dict_to_dataset.html	📝 modified
api/generated/arviz_base.explode_dataset_dims.html	📝 modified
api/generated/arviz_base.extract.html	📝 modified
api/generated/arviz_base.from_cmdstanpy.html	📝 modified
api/generated/arviz_base.from_dict.html	📝 modified
api/generated/arviz_base.from_emcee.html	📝 modified
api/generated/arviz_base.from_numpyro.html	📝 modified
api/generated/arviz_base.load_arviz_data.html	📝 modified
api/generated/arviz_base.ndarray_to_dataarray.html	📝 modified
api/generated/arviz_base.references_to_dataset.html	📝 modified
api/generated/arviz_base.xarray_sel_iter.html	📝 modified
api/generated/arviz_base.xarray_var_iter.html	📝 modified

OriolAbril · 2026-02-12T15:04:00Z

src/arviz_base/utils.py

+    subset: Hashable | Sequence[Hashable] | None,
+    whole_list: Sequence[Hashable],
+    filter_items: str | None = None,
+    warn: bool = True,
+    check_if_present: bool = True,


We only use explicit type hints when we want them to differ from the docstring; otherwise, to keep a single source of truth, we keep the information in the docstring only. (Note that docstub automatically translates that into proper type hints to add to the respective .pyi files.)

OriolAbril · 2026-02-12T15:07:24Z

src/arviz_base/utils.py

    if subset is not None:
        if isinstance(subset, str):
            subset = [subset]
+        elif isinstance(subset, tuple) and subset in whole_list:


I used tuple in the issue as an example, but the whole point of the issue was to do a deeper investigation into the different potentially valid cases and how we want them to behave. If you restrict to tuple then we aren't really matching xarray's behaviour, see for example:

    import xarray as xr

    v = frozenset({"a", "b"})
    ds = xr.Dataset({frozenset({"a", "b"}): (("dim",), [1, 2, 3])})
    ds[v]
    # out
    # <xarray.DataArray frozenset({'b', 'a'}) (dim: 3)> Size: 24B
    # array([1, 2, 3])
    # Dimensions without coordinates: dim
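One possible way to mirror the xarray behaviour shown above is to test for hashability rather than for `tuple` specifically. The sketch below assumes that any hashable found in `whole_list` counts as a single name; `as_name_list` is a hypothetical helper, and this is not a behaviour that has been agreed on.

```python
from collections.abc import Hashable, Iterable


def as_name_list(subset, whole_list):
    """Hypothetical: wrap ``subset`` in a list when it is itself a known name."""
    if isinstance(subset, str):
        return [subset]
    # Unhashable inputs (list, set, ...) can never be variable names.
    if isinstance(subset, Hashable) and subset in whole_list:
        return [subset]
    # Anything else that can be iterated is read as a container of names.
    if isinstance(subset, Iterable):
        return list(subset)
    # Non-iterable hashables (e.g. an int) are treated as a single name.
    return [subset]


names = [frozenset({"a", "b"}), ("theta", "original"), "x", 3]
print(as_name_list(frozenset({"a", "b"}), names))  # the frozenset is kept as one name
print(as_name_list(["x", 3], names))               # ['x', 3]
```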

OriolAbril · 2026-02-12T15:08:32Z

src/arviz_base/utils.py

+        elif isinstance(subset, Sequence) and not isinstance(subset, str | bytes):
+            subset = list(subset)


not sure what this is doing
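For reference, a small standard-library check of which inputs satisfy the condition in the diff line above. `hits_sequence_branch` is a hypothetical name used only for this illustration; it uses the tuple form `(str, bytes)`, which is equivalent to `str | bytes` inside `isinstance`.

```python
from collections.abc import Sequence


def hits_sequence_branch(subset):
    # Same condition as the diff line; (str, bytes) is equivalent to `str | bytes`.
    return isinstance(subset, Sequence) and not isinstance(subset, (str, bytes))


examples = (["a", "b"], ("a", "b"), range(2), "ab", b"ab", {"a", "b"}, frozenset({"a"}))
for value in examples:
    print(type(value).__name__, hits_sequence_branch(value))
```

Lists, tuples and ranges satisfy the condition and would be converted with `list(subset)`; strings, bytes, sets and frozensets do not, since sets are not `Sequence` instances.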

OriolAbril · 2026-02-12T15:18:12Z

src/arviz_base/utils.py

+                        real_items = [
+                            real_item
+                            for real_item in whole_list
+                            if isinstance(real_item, str) and pattern in real_item
+                        ]


I am not sure this is what we want. IIUC, with this behaviour, if I use var_names="~theta", filter_vars="like" and my variable names are ("theta", "original"), ("theta", "transformed"), and ("tau", "original"), I end up plotting/keeping all the variables.

I think for like it would make more sense to exclude the first two. For regex I am much less sure if we want to try and do something complicated or keep things simple and ignore filter_vars completely in case of non-string elements.
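The alternative suggested for `like` could look roughly like the sketch below. It assumes that a pattern matches a tuple name when it is a substring of any string element of the tuple; `like_match` and `filter_like` are hypothetical names, and this is only one candidate behaviour, not a decision.

```python
def like_match(pattern, name):
    """Hypothetical: substring match on a str name or on any str element of a tuple name."""
    if isinstance(name, str):
        return pattern in name
    if isinstance(name, tuple):
        return any(isinstance(part, str) and pattern in part for part in name)
    return False


def filter_like(pattern, whole_list):
    """Keep matching names, or drop them when the pattern starts with "~"."""
    negate = pattern.startswith("~")
    if negate:
        pattern = pattern[1:]
    matched = [name for name in whole_list if like_match(pattern, name)]
    if negate:
        return [name for name in whole_list if name not in matched]
    return matched


names = [("theta", "original"), ("theta", "transformed"), ("tau", "original")]
print(filter_like("~theta", names))  # [('tau', 'original')]
```

With this rule, `"~theta"` excludes the two `theta` tuples instead of keeping every variable.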

Important note: This is a collaborative project and it will quite probably take a while until we all agree on a behaviour around this. I may have ideas, but me saying "I think this or that should happen" doesn't automatically mean this should be the behaviour of the library. It can be frustrating, but you'll probably need some extra patience for this PR.

Chirag3841 · 2026-02-12

Thanks for the clarification. I agree it’s better to align with xarray behavior and get consensus before finalizing anything. I’m happy to iterate based on feedback and adjust the implementation/tests as needed. Please let me know what target behavior you’d prefer and I can update the PR accordingly.

Support tuple variable names in _subset_list

88116f0

Support tuple variable names in _subset_list

cefa874

OriolAbril reviewed Feb 12, 2026

Chirag3841 added 2 commits February 16, 2026 18:51

Handle non-string hashable names in _subset_list

049066b

Support non-string hashable names in _subset_list

e4acbea

Chirag3841 wants to merge 4 commits into arviz-devs:main from Chirag3841:var
