Commit 4a74983

Merge branch 'main' into forman-config_revision
# Conflicts:
#	CHANGES.md
#	tests/test_config.py
2 parents 2f5b8c1 + 4c3ed27 commit 4a74983

18 files changed

+539
-32
lines changed

CHANGES.md

Lines changed: 12 additions & 1 deletion
@@ -14,7 +14,18 @@
 - Added class method `from_config()` to `ConfigList`.
 - Removed function `xrlint.config.merge_configs` as it was no longer used.
 
-## Version 0.4.1 (in development)
+## Version 0.4.1 (from 2025-01-31)
+
+### Changes
+
+- Added core rule `conventions` that checks for the `Conventions` attribute.
+- Added core rule `content-desc` that checks content description
+- Added core rule `var-desc` that checks data variable description
+- Renamed rules for consistency:
+  - `var-units-attr` into `var-units`
+  - `flags` into `var-flags`
+
+### Fixes
 
 - Fixed an issue that prevented recursively traversing folders referred
   to by URLs (such as `s3://<bucket>/<path>/`) rather than local directory

docs/rule-ref.md

Lines changed: 38 additions & 9 deletions
@@ -5,6 +5,27 @@ New rules will be added by upcoming XRLint releases.
 
 ## Core Rules
 
+### :material-lightbulb: `content-desc`
+
+A dataset should provide information about where the data came from and what has been done to it. This information is mainly for the benefit of human readers. The rule accepts the following configuration parameters:
+
+- `globals`: list of names of required global attributes. Defaults to `['title', 'history']`.
+- `commons`: list of names of required variable attributes that can also be defined globally. Defaults to `['institution', 'source', 'references', 'comment']`.
+- `no_vars`: do not check variables at all. Defaults to `False`.
+- `ignored_vars`: list of ignored variables (regex patterns). Defaults to `['crs', 'spatial_ref']`.
+
+[:material-information-variant:](https://cfconventions.org/cf-conventions/cf-conventions.html#description-of-file-contents)
+
+Contained in: `all`-:material-lightning-bolt: `recommended`-:material-alert:
+
+### :material-lightbulb: `conventions`
+
+Datasets should identify the applicable conventions using the `Conventions` attribute.
+The rule has an optional configuration parameter `match`, a regex pattern that the value of the `Conventions` attribute must match. If `match` is not provided, the rule only verifies that the attribute exists and is a character string.
+[:material-information-variant:](https://cfconventions.org/cf-conventions/cf-conventions.html#identification-of-conventions)
+
+Contained in: `all`-:material-lightning-bolt: `recommended`-:material-alert:
+
 ### :material-bug: `coords-for-dims`
 
 Dimensions of data variables should have corresponding coordinates.
@@ -17,13 +38,6 @@ Datasets should be given a non-empty title.
 
 Contained in: `all`-:material-lightning-bolt: `recommended`-:material-alert:
 
-### :material-lightbulb: `flags`
-
-Validate attributes 'flag_values', 'flag_masks' and 'flag_meanings' that make variables that contain flag values self describing.
-[:material-information-variant:](https://cfconventions.org/cf-conventions/cf-conventions.html#flags)
-
-Contained in: `all`-:material-lightning-bolt: `recommended`-:material-lightning-bolt:
-
 ### :material-bug: `grid-mappings`
 
 Grid mappings, if any, shall have valid grid mapping coordinate variables.
@@ -64,9 +78,24 @@ Time coordinates should have valid and unambiguous time units encoding.
 
 Contained in: `all`-:material-lightning-bolt: `recommended`-:material-lightning-bolt:
 
-### :material-lightbulb: `var-units-attr`
+### :material-lightbulb: `var-desc`
+
+Check that each data variable provides an identification and description of the content. The rule can be configured by parameter `attrs`, which is a list of names of attributes that provide descriptive information. It defaults to `['standard_name', 'long_name']`.
+[:material-information-variant:](https://cfconventions.org/cf-conventions/cf-conventions.html#standard-name)
+
+Contained in: `all`-:material-lightning-bolt: `recommended`-:material-alert:
+
+### :material-lightbulb: `var-flags`
+
+Validate attributes 'flag_values', 'flag_masks' and 'flag_meanings' that make variables that contain flag values self describing.
+[:material-information-variant:](https://cfconventions.org/cf-conventions/cf-conventions.html#flags)
+
+Contained in: `all`-:material-lightning-bolt: `recommended`-:material-lightning-bolt:
+
+### :material-lightbulb: `var-units`
 
-Every variable should have a valid 'units' attribute.
+Every variable should provide a description of its units.
+[:material-information-variant:](https://cfconventions.org/cf-conventions/cf-conventions.html#units)
 
 Contained in: `all`-:material-lightning-bolt: `recommended`-:material-alert:
 

docs/todo.md

Lines changed: 7 additions & 0 deletions
@@ -10,6 +10,13 @@
 - use mkdocstrings ref syntax in docstrings
 - provide configuration examples (use as tests?)
 - add `docs_url` to all existing rules
+- API changes for v0.5:
+  - clarify when users can pass configuration-object-like values
+    and when configuration-like values
+  - config class naming is confusing,
+    change `Config` -> `ConfigObject`, `ConfigList` -> `Config`
+  - Change `verify` -> `validate`,
+    prefix `RuleOp` methods by `validate_` for clarity.
 
 ## Desired
 

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+import xarray as xr
+
+from xrlint.plugins.core.rules.content_desc import ContentDesc
+from xrlint.testing import RuleTest, RuleTester
+
+global_attrs = dict(
+    title="OC-Climatology",
+    history="2025-01-26: created",
+)
+
+common_attrs = dict(
+    institution="ESA",
+    source="a.nc; b.nc",
+    references="!",
+    comment="?",
+)
+
+all_attrs = global_attrs | common_attrs
+
+time_coord = xr.DataArray(
+    [1, 2, 3], dims="time", attrs=dict(units="days since 2025-01-01")
+)
+
+valid_dataset_0 = xr.Dataset(
+    attrs=all_attrs,
+    data_vars=dict(chl=xr.DataArray([1, 2, 3], dims="time", attrs=dict())),
+    coords=dict(time=time_coord),
+)
+valid_dataset_1 = xr.Dataset(
+    attrs=global_attrs,
+    data_vars=dict(chl=xr.DataArray([1, 2, 3], dims="time", attrs=common_attrs)),
+    coords=dict(time=time_coord),
+)
+valid_dataset_1a = xr.Dataset(
+    attrs=global_attrs,
+    data_vars=dict(
+        chl=xr.DataArray([1, 2, 3], dims="time", attrs=common_attrs),
+        crs=xr.DataArray(0, attrs=dict(grid_mapping_name="...")),
+    ),
+    coords=dict(time=time_coord),
+)
+valid_dataset_1b = xr.Dataset(
+    attrs=global_attrs,
+    data_vars=dict(
+        chl=xr.DataArray([1, 2, 3], dims="time", attrs=common_attrs),
+        chl_unc=xr.DataArray(0, attrs=dict(units="...")),
+    ),
+    coords=dict(time=time_coord),
+)
+valid_dataset_2 = xr.Dataset(
+    attrs=global_attrs,
+    data_vars=dict(chl=xr.DataArray([1, 2, 3], dims="time", attrs=dict())),
+    coords=dict(time=time_coord),
+)
+valid_dataset_3 = xr.Dataset(
+    attrs=global_attrs,
+    data_vars=dict(
+        chl=xr.DataArray([1, 2, 3], dims="time", attrs=dict(description="Bla!"))
+    ),
+    coords=dict(time=time_coord),
+)
+
+invalid_dataset_0 = xr.Dataset()
+invalid_dataset_1 = xr.Dataset(
+    attrs=dict(),
+    data_vars=dict(chl=xr.DataArray([1, 2, 3], dims="time", attrs=dict())),
+    coords=dict(time=time_coord),
+)
+invalid_dataset_2 = xr.Dataset(
+    attrs=global_attrs,
+    data_vars=dict(chl=xr.DataArray([1, 2, 3], dims="time", attrs=dict())),
+    coords=dict(time=time_coord),
+)
+
+ContentDescTest = RuleTester.define_test(
+    "content-desc",
+    ContentDesc,
+    valid=[
+        RuleTest(dataset=valid_dataset_0, name="0"),
+        RuleTest(dataset=valid_dataset_1, name="1"),
+        RuleTest(dataset=valid_dataset_1a, name="1a"),
+        RuleTest(
+            dataset=valid_dataset_1b, name="1b", kwargs={"ignored_vars": ["chl_unc"]}
+        ),
+        RuleTest(dataset=valid_dataset_2, name="2", kwargs={"commons": []}),
+        RuleTest(
+            dataset=valid_dataset_2, name="2", kwargs={"commons": [], "skip_vars": True}
+        ),
+        RuleTest(
+            dataset=valid_dataset_3, name="3", kwargs={"commons": ["description"]}
+        ),
+    ],
+    invalid=[
+        RuleTest(dataset=invalid_dataset_0, expected=2),
+        RuleTest(dataset=invalid_dataset_1, expected=6),
+        RuleTest(dataset=invalid_dataset_2, kwargs={"skip_vars": True}, expected=4),
+    ],
+)
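
The expected issue counts in the `content-desc` tests above (2, 6 and 4) are consistent with the reading of the rule sketched below. This is a minimal sketch on plain dictionaries, not the actual `ContentDesc` implementation: the function name `count_content_desc_issues`, the parameter name `globals_`, the use of `re.fullmatch` for `ignored_vars`, and the behavior when variables are skipped are all assumptions inferred from the test data. The tests pass the option as `skip_vars`, whereas the docs hunk lists it as `no_vars`; the sketch follows the tests.

```python
import re
from typing import Any, Mapping, Sequence


def count_content_desc_issues(
    global_attrs: Mapping[str, Any],
    var_attrs: Mapping[str, Mapping[str, Any]],
    globals_: Sequence[str] = ("title", "history"),
    commons: Sequence[str] = ("institution", "source", "references", "comment"),
    skip_vars: bool = False,
    ignored_vars: Sequence[str] = ("crs", "spatial_ref"),
) -> int:
    """Count missing description attributes (hypothetical helper).

    `global_attrs` stands in for `dataset.attrs`; `var_attrs` maps each
    data variable name to its attribute dictionary.
    """
    # Every name in `globals_` must be a global attribute.
    issues = sum(1 for name in globals_ if name not in global_attrs)

    if skip_vars:
        # Assumption: without variable checks, `commons` must be global.
        return issues + sum(1 for name in commons if name not in global_attrs)

    for var_name, attrs in var_attrs.items():
        # Assumption: ignore patterns must match the whole variable name.
        if any(re.fullmatch(pattern, var_name) for pattern in ignored_vars):
            continue
        # A common attribute may be set globally or on the variable itself.
        issues += sum(
            1 for name in commons if name not in global_attrs and name not in attrs
        )
    return issues
```

Under these assumptions an empty dataset yields 2 issues (the two missing globals), and a dataset without any attributes but with one undescribed variable yields 6, matching `invalid_dataset_0` and `invalid_dataset_1`.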
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+import xarray as xr
+
+from xrlint.plugins.core.rules.conventions import Conventions
+from xrlint.testing import RuleTest, RuleTester
+
+valid_dataset_0 = xr.Dataset(attrs=dict(Conventions="CF-1.10"))
+
+invalid_dataset_0 = xr.Dataset()
+invalid_dataset_1 = xr.Dataset(attrs=dict(Conventions=1.12))
+invalid_dataset_2 = xr.Dataset(attrs=dict(Conventions="CF 1.10"))
+
+
+ConventionsTest = RuleTester.define_test(
+    "conventions",
+    Conventions,
+    valid=[
+        RuleTest(dataset=valid_dataset_0),
+        RuleTest(dataset=valid_dataset_0, kwargs={"match": r"CF-.*"}),
+    ],
+    invalid=[
+        RuleTest(
+            dataset=invalid_dataset_0,
+            expected=["Missing attribute 'Conventions'."],
+        ),
+        RuleTest(
+            dataset=invalid_dataset_1,
+            expected=["Invalid attribute 'Conventions': 1.12."],
+        ),
+        RuleTest(
+            dataset=invalid_dataset_2,
+            kwargs={"match": r"CF-.*"},
+            expected=[
+                "Invalid attribute 'Conventions': 'CF 1.10' doesn't match 'CF-.*'."
+            ],
+        ),
+    ],
+)
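
The three failure messages expected by the `conventions` tests above suggest a check in three steps: presence, type, then the optional pattern. The following is a minimal sketch of that logic on a plain attribute dictionary, not the actual `Conventions` rule; the function name `check_conventions` is hypothetical, and the choice of `re.match` (anchored at the start only) is an assumption, since the diff does not show which matching function the rule uses.

```python
import re
from typing import Any, Mapping, Optional


def check_conventions(
    attrs: Mapping[str, Any], match: Optional[str] = None
) -> list[str]:
    """Return the messages for a dataset's global attributes (hypothetical helper)."""
    if "Conventions" not in attrs:
        return ["Missing attribute 'Conventions'."]

    value = attrs["Conventions"]
    if not isinstance(value, str):
        # Non-string values such as 1.12 are reported as invalid.
        return [f"Invalid attribute 'Conventions': {value!r}."]

    # Assumption: the pattern is applied with re.match.
    if match is not None and not re.match(match, value):
        return [
            f"Invalid attribute 'Conventions': {value!r} doesn't match {match!r}."
        ]
    return []
```

With `match=r"CF-.*"`, the value `"CF-1.10"` passes and `"CF 1.10"` is reported, which mirrors `valid_dataset_0` and `invalid_dataset_2`.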
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+import xarray as xr
+
+from xrlint.plugins.core.rules.var_desc import VarDesc
+from xrlint.testing import RuleTest, RuleTester
+
+pressure_attrs = dict(
+    long_name="mean sea level pressure",
+    units="hPa",
+    standard_name="air_pressure_at_sea_level",
+)
+
+time_coord = xr.DataArray(
+    [1, 2, 3], dims="time", attrs=dict(units="days since 2025-01-01")
+)
+
+valid_dataset_0 = xr.Dataset(
+    coords=dict(time=time_coord),
+)
+valid_dataset_1 = xr.Dataset(
+    data_vars=dict(pressure=xr.DataArray([1, 2, 3], dims="time", attrs=pressure_attrs)),
+    coords=dict(time=time_coord),
+)
+valid_dataset_2 = xr.Dataset(
+    data_vars=dict(
+        chl=xr.DataArray(
+            [1, 2, 3], dims="time", attrs=dict(description="It is air pressure")
+        )
+    ),
+    coords=dict(time=time_coord),
+)
+
+invalid_dataset_0 = xr.Dataset(
+    attrs=dict(),
+    data_vars=dict(chl=xr.DataArray([1, 2, 3], dims="time", attrs=dict())),
+    coords=dict(time=time_coord),
+)
+
+invalid_dataset_1 = xr.Dataset(
+    attrs=dict(),
+    data_vars=dict(
+        chl=xr.DataArray(
+            [1, 2, 3],
+            dims="time",
+            attrs=dict(standard_name="air_pressure_at_sea_level"),
+        )
+    ),
+    coords=dict(time=time_coord),
+)
+invalid_dataset_2 = xr.Dataset(
+    attrs=dict(),
+    data_vars=dict(
+        chl=xr.DataArray(
+            [1, 2, 3], dims="time", attrs=dict(long_name="mean sea level pressure")
+        )
+    ),
+    coords=dict(time=time_coord),
+)
+invalid_dataset_3 = xr.Dataset(
+    attrs=dict(),
+    data_vars=dict(chl=xr.DataArray([1, 2, 3], dims="time", attrs=pressure_attrs)),
+    coords=dict(time=time_coord),
+)
+
+VarDescTest = RuleTester.define_test(
+    "var-desc",
+    VarDesc,
+    valid=[
+        RuleTest(dataset=valid_dataset_0),
+        RuleTest(dataset=valid_dataset_1),
+        RuleTest(dataset=valid_dataset_2, kwargs={"attrs": ["description"]}),
+    ],
+    invalid=[
+        RuleTest(
+            dataset=invalid_dataset_0,
+            expected=[
+                "Missing attribute 'standard_name'.",
+                "Missing attribute 'long_name'.",
+            ],
+        ),
+        RuleTest(
+            dataset=invalid_dataset_1, expected=["Missing attribute 'long_name'."]
+        ),
+        RuleTest(
+            dataset=invalid_dataset_2, expected=["Missing attribute 'standard_name'."]
+        ),
+        RuleTest(
+            dataset=invalid_dataset_3,
+            kwargs={"attrs": ["description"]},
+            expected=["Missing attribute 'description'."],
+        ),
+    ],
+)
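
The `var-desc` expectations above read as "report every configured attribute that a data variable lacks, in configuration order". A minimal sketch of that check for a single variable follows; it operates on a plain attribute dictionary rather than an `xr.DataArray`, and the function name `check_var_desc` is hypothetical, not part of xrlint.

```python
from typing import Any, Mapping, Sequence


def check_var_desc(
    var_attrs: Mapping[str, Any],
    attrs: Sequence[str] = ("standard_name", "long_name"),
) -> list[str]:
    """Return one message per configured attribute missing from a variable."""
    # Messages follow the order of `attrs`, as in the expected test output.
    return [f"Missing attribute {name!r}." for name in attrs if name not in var_attrs]
```

Note that overriding `attrs` replaces the defaults rather than extending them: with `attrs=["description"]`, a variable carrying only `standard_name` and `long_name` is reported, as in `invalid_dataset_3`.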

tests/plugins/core/rules/test_flags.py renamed to tests/plugins/core/rules/test_var_flags.py

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
 import numpy as np
 import xarray as xr
 
-from xrlint.plugins.core.rules.flags import Flags
+from xrlint.plugins.core.rules.var_flags import VarFlags
 from xrlint.testing import RuleTest, RuleTester
 
 valid_dataset_0 = xr.Dataset()
@@ -73,9 +73,9 @@
     np.float64
 )
 
-FlagsTest = RuleTester.define_test(
-    "flags",
-    Flags,
+VarFlagsTest = RuleTester.define_test(
+    "var-flags",
+    VarFlags,
     valid=[
         RuleTest(dataset=valid_dataset_0),
         RuleTest(dataset=valid_dataset_1),

tests/plugins/core/rules/test_var_units_attr.py renamed to tests/plugins/core/rules/test_var_units.py

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
 import xarray as xr
 
-from xrlint.plugins.core.rules.var_units_attr import VarUnitsAttr
+from xrlint.plugins.core.rules.var_units import VarUnits
 from xrlint.testing import RuleTest, RuleTester
 
 valid_dataset_1 = xr.Dataset()
@@ -19,9 +19,9 @@
 invalid_dataset_3.v.attrs = {"units": 1}
 
 
-VarUnitsAttrTest = RuleTester.define_test(
-    "var-units-attr",
-    VarUnitsAttr,
+VarUnitsTest = RuleTester.define_test(
+    "var-units",
+    VarUnits,
     valid=[
         RuleTest(dataset=valid_dataset_1),
         RuleTest(dataset=valid_dataset_2),

tests/plugins/core/test_plugin.py

Lines changed: 5 additions & 2 deletions
@@ -8,16 +8,19 @@ def test_rules_complete(self):
         plugin = export_plugin()
         self.assertEqual(
             {
+                "content-desc",
+                "conventions",
                 "coords-for-dims",
                 "dataset-title-attr",
-                "flags",
                 "grid-mappings",
                 "lat-coordinate",
                 "lon-coordinate",
                 "no-empty-attrs",
                 "time-coordinate",
                 "no-empty-chunks",
-                "var-units-attr",
+                "var-desc",
+                "var-flags",
+                "var-units",
             },
             set(plugin.rules.keys()),
         )
