Commit 29a0a69

MNT: Update enum parsing (#68)
Aligns xncml's enum parsing with the behavior of xarray's netCDF4 backend.
1 parent dc7ab52 commit 29a0a69

3 files changed: 10 additions, 13 deletions

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 **Breaking changes**
 - Nested group handling:
 Before this version, all groups were read, but conflicting variable names in-between groups would shadow data. Now, similarly to xarray ``open_dataset``, ``open_ncml`` accepts an optional ``group`` argument to specify which group should be read. When ``group`` is not specified, it defaults to the root group. Additionally ``group`` can be set to ``'*'`` so that every group is read and the hierarchy is flattened. In the event of conflicting variable/dimension names across groups, the conflicting name will be modified by appending ``'__n'`` where n is incremented.
-
+- Enums are no longer transformed into CF flag_values and flag_meanings attributes, instead they are stored in the ``encoding["dtype"].metadata`` of their respective variable. This is aligned with what is done on xarray v2024.01.0
 
 0.4.0 (2024-01-08)
 ==================
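
Since this is a breaking change, downstream code that read the old attributes may want a way to rebuild them. The following is a minimal sketch, assuming the enum mapping has the name-to-value shape shown in this commit's test (`{'false': 0, 'true': 1}`). The helper name `enum_to_cf_flags` is hypothetical and not part of xncml; it returns lists, mirroring the form the removed code produced.

```python
def enum_to_cf_flags(enum: dict[str, int]) -> dict[str, list]:
    """Rebuild flag_values / flag_meanings lists from a name -> value mapping.

    Hypothetical helper, not part of xncml.
    """
    # Sort by value so both lists line up and the order is deterministic.
    items = sorted(enum.items(), key=lambda kv: kv[1])
    return {
        'flag_values': [value for _, value in items],
        'flag_meanings': [name for name, _ in items],
    }


flags = enum_to_cf_flags({'false': 0, 'true': 1})
```

The result could then be merged into a variable's `attrs` by code that still expects the previous layout.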

tests/test_parser.py

Lines changed: 2 additions & 2 deletions
@@ -333,8 +333,8 @@ def test_multiple_values_for_scalar():
 def test_read_enum():
     """A enum should be turned into CF flag_values and flag_meanings attributes."""
     ds = xncml.open_ncml(data / 'testEnums.xml')
-    assert ds['be_or_not_to_be'].attrs['flag_values'] == [0, 1]
-    assert ds['be_or_not_to_be'].attrs['flag_meanings'] == ['false', 'true']
+    assert ds.be_or_not_to_be.dtype.metadata['enum'] == {'false': 0, 'true': 1}
+    assert ds.be_or_not_to_be.dtype.metadata['enum_name'] == 'boolean'
 
 
 def test_empty_attr():
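
The updated assertions read the enum from `dtype.metadata`. As an illustration of how a consumer might use that metadata, here is a small sketch that maps integer codes back to their names. It uses only NumPy; the dtype is built by hand with the same metadata layout the test asserts, and `decode_enum` is a hypothetical helper, not an xncml or xarray function.

```python
import numpy as np

# Hand-built dtype with the same metadata layout asserted in test_read_enum.
enum_dtype = np.dtype(
    'u1', metadata={'enum': {'false': 0, 'true': 1}, 'enum_name': 'boolean'}
)


def decode_enum(values, dtype):
    """Map integer codes to enum names using dtype.metadata['enum'].

    Hypothetical helper for illustration only.
    """
    # Invert the name -> value mapping so codes can be looked up directly.
    names_by_value = {value: name for name, value in dtype.metadata['enum'].items()}
    return [names_by_value[int(v)] for v in values]


labels = decode_enum(np.array([1, 0, 1], dtype='u1'), enum_dtype)
```

With a dataset opened by `open_ncml`, the same helper would presumably be called with the variable's values and its `dtype`, assuming the metadata survives as the test expects.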

xncml/parser.py

Lines changed: 7 additions & 10 deletions
@@ -459,20 +459,17 @@ def read_enum(obj: EnumTypedef) -> dict[str, list]:
     Returns
     -------
     dict:
-        A dictionary with CF flag_values and flag_meanings that describe the Enum.
+        A dictionary describing the Enum.
     """
-    return {
-        'flag_values': list(map(lambda e: e.key, obj.content)),
-        'flag_meanings': list(map(lambda e: e.content[0], obj.content)),
-    }
+    return {e.content[0]: e.key for e in obj.content}
 
 
 def read_variable(
     target: xr.Dataset,
     ref: xr.Dataset,
     obj: Variable,
     dimensions: dict,
-    enums: dict,
+    enums: dict[str, dict[str, int]],
     group_path: str,
 ) -> xr.Dataset:
     """
@@ -576,10 +573,10 @@ def read_variable(
         raise NotImplementedError
 
     if obj.typedef in enums.keys():
-        # TODO (@bzah): Update this once Enums are merged in xarray
-        # https://github.com/pydata/xarray/pull/8147
-        out.attrs['flag_values'] = enums[obj.typedef]['flag_values']
-        out.attrs['flag_meanings'] = enums[obj.typedef]['flag_meanings']
+        dtype = out.dtype
+        new_dtype = np.dtype(dtype, metadata={'enum': enums[obj.typedef], 'enum_name': obj.typedef})
+        out.encoding['dtype'] = new_dtype
+        out = out.astype(new_dtype)
     elif obj.typedef is not None:
         raise NotImplementedError
     import re
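
To make the two parser changes concrete, the sketch below reproduces them outside xncml: the dict comprehension from `read_enum`, and the `np.dtype(..., metadata=...)` call from `read_variable`. `EnumEntry` is a hypothetical stand-in exposing only the `key` and `content` fields the diff accesses; the real `EnumTypedef` entries may carry more.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class EnumEntry:
    """Hypothetical stand-in for one entry of an EnumTypedef."""
    key: int
    content: list


def read_enum_entries(entries):
    """Build the name -> value mapping with the same comprehension as read_enum."""
    return {e.content[0]: e.key for e in entries}


entries = [EnumEntry(0, ['false']), EnumEntry(1, ['true'])]
enum = read_enum_entries(entries)

# Attach the mapping to a copy of the base dtype, as read_variable does.
base_dtype = np.dtype('u1')
new_dtype = np.dtype(base_dtype, metadata={'enum': enum, 'enum_name': 'boolean'})
```

The metadata rides along with the dtype object, which is why the commit stores `new_dtype` in `out.encoding['dtype']` rather than writing separate attributes.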
