Commit 7e9b4f8

peanutfun and chahank authored
Allow reading Hazard events that are not dates from xarray (#837)
* Allow reading Hazard events that are not dates from xarray
* Set default value of `Hazard.event_name` to empty string.
* Try interpreting values of the event coordinate as dates or ordinals for default values of `Hazard.date`. If that fails, issue a warning and set default values to zeros.
* Update tests.
* Try to read event coordinate as date
* Update climada/hazard/base.py
* Apply formatter to test_base_xarray.py
* Set default ordinal to 1, fix tests
* Fix linter warnings
* Switch back to class setup for xarray tests
* Clarify docstring of Hazard.from_xarray_raster
* Update CHANGELOG.md
* Update climada/hazard/base.py

---------

Co-authored-by: Chahan M. Kropf <[email protected]>
1 parent 104dfdb commit 7e9b4f8
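
The fallback described in the commit message can be sketched without CLIMADA or xarray. The snippet below is only an illustration under stated assumptions: `to_ordinals` and `EPOCH_ORDINAL` are hypothetical names, and the conversion uses plain NumPy instead of the `u_dt.datetime64_to_ordinal` helper that the actual change calls.

```python
import datetime as dt
import logging

import numpy as np

LOGGER = logging.getLogger(__name__)

# Ordinal of 1970-01-01 in the proleptic Gregorian calendar
EPOCH_ORDINAL = dt.date(1970, 1, 1).toordinal()


def to_ordinals(values, strict=True):
    """Hypothetical helper: interpret values as date ordinals."""
    values = np.asarray(values)
    try:
        if np.issubdtype(values.dtype, np.integer):
            # Integers are assumed to be ordinals already
            if not (values > 0).all():
                raise ValueError("ordinals must be larger than zero")
            return values
        if not np.issubdtype(values.dtype, np.datetime64):
            raise TypeError(f"cannot interpret dtype {values.dtype} as dates")
        # Whole days since the epoch, shifted to date ordinals
        days = values.astype("datetime64[D]").astype(np.int64)
        return days + EPOCH_ORDINAL
    except (ValueError, TypeError):
        if strict:
            raise
        # Non-strict mode: warn and fall back to ones
        LOGGER.warning("Failed to read values as dates or ordinals, using ones")
        return np.ones(values.shape)
```

With `strict=False`, string input such as `np.array(["a", "b"])` yields an array of ones and a warning; with the default `strict=True`, the same input raises `TypeError`.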

3 files changed: +90 -17 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ Code freeze date: YYYY-MM-DD
 - Update `CONTRIBUTING.md` to better explain types of contributions to this repository [#797](https://github.com/CLIMADA-project/climada_python/pull/797)
 - The default tile layer in Exposures maps is not Stamen Terrain anymore, but [CartoDB Positron](https://github.com/CartoDB/basemap-styles). Affected methods are `climada.engine.Impact.plot_basemap_eai_exposure`,`climada.engine.Impact.plot_basemap_impact_exposure` and `climada.entity.Exposures.plot_basemap`. [#798](https://github.com/CLIMADA-project/climada_python/pull/798)
 - Recommend using Mamba instead of Conda for installing CLIMADA [#809](https://github.com/CLIMADA-project/climada_python/pull/809)
+- `Hazard.from_xarray_raster` now allows arbitrary values as 'event' coordinates [#837](https://github.com/CLIMADA-project/climada_python/pull/837)

 ### Fixed

climada/hazard/base.py

Lines changed: 61 additions & 17 deletions
@@ -463,12 +463,13 @@ def from_xarray_raster(
     ):
         """Read raster-like data from an xarray Dataset

-        This method reads data that can be interpreted using three coordinates for event,
-        latitude, and longitude. The data and the coordinates themselves may be organized
-        in arbitrary dimensions in the Dataset (e.g. three dimensions 'year', 'month',
-        'day' for the coordinate 'event'). The three coordinates to be read can be
-        specified via the ``coordinate_vars`` parameter. See Notes and Examples if you
-        want to load single-event data that does not contain an event dimension.
+        This method reads data that can be interpreted using three coordinates: event,
+        latitude, and longitude. The names of the coordinates to be read from the
+        dataset can be specified via the ``coordinate_vars`` parameter. The data and the
+        coordinates themselves may be organized in arbitrary dimensions (e.g. two
+        dimensions 'year' and 'altitude' for the coordinate 'event'). See Notes and
+        Examples if you want to load single-event data that does not contain an event
+        dimension.

         The only required data is the intensity. For all other data, this method can
         supply sensible default values. By default, this method will try to find these
@@ -513,12 +514,14 @@ def from_xarray_raster(

             Default values are:

-            * ``date``: The ``event`` coordinate interpreted as date
+            * ``date``: The ``event`` coordinate interpreted as date or ordinal, or
+              ones if that fails (which will issue a warning).
             * ``fraction``: ``None``, which results in a value of 1.0 everywhere, see
               :py:meth:`Hazard.__init__` for details.
             * ``hazard_type``: Empty string
             * ``frequency``: 1.0 for every event
-            * ``event_name``: String representation of the event time
+            * ``event_name``: String representation of the event date or empty strings
+              if that fails (which will issue a warning).
             * ``event_id``: Consecutive integers starting at 1 and increasing with time
         crs : str, optional
             Identifier for the coordinate reference system of the coordinates. Defaults
@@ -553,13 +556,16 @@ def from_xarray_raster(
           and Examples) before loading the Dataset as Hazard.
         * Single-valued data for variables ``frequency``, ``event_name``, and
           ``event_date`` will be broadcast to every event.
+        * The ``event`` coordinate may take arbitrary values. In case these values
+          cannot be interpreted as dates or date ordinals, the default values for
+          ``Hazard.date`` and ``Hazard.event_name`` are used, see the
+          ``data_vars`` parameter documentation above.
         * To avoid confusion in the call signature, several parameters are keyword-only
           arguments.
         * The attributes ``Hazard.haz_type`` and ``Hazard.unit`` currently cannot be
           read from the Dataset. Use the method parameters to set these attributes.
         * This method does not read coordinate system metadata. Use the ``crs`` parameter
           to set a custom coordinate system identifier.
-        * This method **does not** read lazily. Single data arrays must fit into memory.

         Examples
         --------
@@ -802,14 +808,48 @@ def strict_positive_int_accessor(array: xr.DataArray) -> np.ndarray:
                 raise ValueError(f"'{array.name}' data must be larger than zero")
             return array.values

-        def date_to_ordinal_accessor(array: xr.DataArray) -> np.ndarray:
+        def date_to_ordinal_accessor(
+            array: xr.DataArray, strict: bool = True
+        ) -> np.ndarray:
             """Take a DataArray and transform it into ordinals"""
-            if np.issubdtype(array.dtype, np.integer):
-                # Assume that data is ordinals
-                return strict_positive_int_accessor(array)
+            try:
+                if np.issubdtype(array.dtype, np.integer):
+                    # Assume that data is ordinals
+                    return strict_positive_int_accessor(array)
+
+                # Try transforming to ordinals
+                return np.array(u_dt.datetime64_to_ordinal(array.values))
+
+            # Handle access errors
+            except (ValueError, TypeError) as err:
+                if strict:
+                    raise err
+
+                LOGGER.warning(
+                    "Failed to read values of '%s' as dates or ordinals. Hazard.date "
+                    "will be ones only",
+                    array.name,
+                )
+                return np.ones(array.shape)
+
+        def year_month_day_accessor(
+            array: xr.DataArray, strict: bool = True
+        ) -> np.ndarray:
+            """Take an array and return an array of YYYY-MM-DD strings"""
+            try:
+                return array.dt.strftime("%Y-%m-%d").values
+
+            # Handle access errors
+            except (ValueError, TypeError) as err:
+                if strict:
+                    raise err

-            # Try transforming to ordinals
-            return np.array(u_dt.datetime64_to_ordinal(array.values))
+                LOGGER.warning(
+                    "Failed to read values of '%s' as dates. Hazard.event_name will be "
+                    "empty strings",
+                    array.name,
+                )
+                return np.full(array.shape, "")

         def maybe_repeat(values: np.ndarray, times: int) -> np.ndarray:
             """Return the array or repeat a single-valued array
@@ -840,8 +880,12 @@ def maybe_repeat(values: np.ndarray, times: int) -> np.ndarray:
                     None,
                     np.ones(num_events),
                     np.array(range(num_events), dtype=int) + 1,
-                    data[coords["event"]].dt.strftime("%Y-%m-%d").values.flatten().tolist(),
-                    np.array(u_dt.datetime64_to_ordinal(data[coords["event"]].values)),
+                    list(
+                        year_month_day_accessor(
+                            data[coords["event"]], strict=False
+                        ).flat
+                    ),
+                    date_to_ordinal_accessor(data[coords["event"]], strict=False),
                 ],
                 # The accessor for the data in the Dataset
                 accessor=[
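
The `strict` flag added to both accessors lets the same function serve two roles: raising for user-supplied variables and falling back quietly for defaults. A minimal sketch of that pattern for event names follows. It assumes plain NumPy arrays instead of `xr.DataArray`, and `to_event_names` is a hypothetical name, not part of CLIMADA.

```python
import logging

import numpy as np

LOGGER = logging.getLogger(__name__)


def to_event_names(values, strict=True):
    """Hypothetical helper: format datetime values as YYYY-MM-DD strings."""
    values = np.asarray(values)
    try:
        if not np.issubdtype(values.dtype, np.datetime64):
            raise TypeError(f"cannot interpret dtype {values.dtype} as dates")
        # Truncate to whole days so the string form is YYYY-MM-DD
        return np.datetime_as_string(values.astype("datetime64[D]"))
    except (ValueError, TypeError):
        if strict:
            raise
        # Non-strict mode: warn and fall back to empty strings
        LOGGER.warning("Failed to read values as dates, using empty strings")
        return np.full(values.shape, "")


# Flatten a possibly multi-dimensional result into a plain list,
# mirroring the list(....flat) construction in the diff above
names = list(to_event_names(np.array([["a", "b"]]), strict=False).flat)
```

Iterating over `.flat` keeps the default usable when the event coordinate spans more than one dimension, which the docstring explicitly allows.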

climada/hazard/test/test_base_xarray.py

Lines changed: 28 additions & 0 deletions
@@ -162,6 +162,33 @@ def test_type_and_unit(self):
         self.assertEqual(hazard.haz_type, "TC")
         self.assertEqual(hazard.units, "m/s")

+    def test_event_no_time(self):
+        """Test if an event coordinate that is not a time works"""
+        with xr.open_dataset(self.netcdf_path) as dataset:
+            size = dataset.sizes["time"]
+
+            # Positive integers (interpreted as ordinals)
+            time = [2, 1]
+            dataset["time"] = time
+            hazard = Hazard.from_xarray_raster(dataset, "", "")
+            self._assert_default_types(hazard)
+            np.testing.assert_array_equal(
+                hazard.intensity.toarray(), [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
+            )
+            np.testing.assert_array_equal(hazard.date, time)
+            np.testing.assert_array_equal(hazard.event_name, np.full(size, ""))
+
+            # Strings
+            dataset["time"] = ["a", "b"]
+            with self.assertLogs("climada.hazard.base", "WARNING") as cm:
+                hazard = Hazard.from_xarray_raster(dataset, "", "")
+            np.testing.assert_array_equal(hazard.date, np.ones(size))
+            np.testing.assert_array_equal(hazard.event_name, np.full(size, ""))
+            self.assertIn("Failed to read values of 'time' as dates.", cm.output[0])
+            self.assertIn(
+                "Failed to read values of 'time' as dates or ordinals.", cm.output[1]
+            )
+
     def test_data_vars(self):
         """Check handling of data variables"""
         with xr.open_dataset(self.netcdf_path) as dataset:
@@ -571,6 +598,7 @@ def test_errors(self):
                     coordinate_vars=dict(latitude="lalalatitude"),
                 )

+
 # Execute Tests
 if __name__ == "__main__":
     TESTS = unittest.TestLoader().loadTestsFromTestCase(TestReadDefaultNetCDF)
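
The new test checks `cm.output[0]` and `cm.output[1]` separately, so it depends on the order in which the two warnings are emitted. The sketch below shows how `assertLogs` records messages in emission order. The logger name `sketch.hazard` and the function `read_defaults` are hypothetical stand-ins, not CLIMADA code.

```python
import logging
import unittest

LOGGER = logging.getLogger("sketch.hazard")


def read_defaults():
    """Hypothetical stand-in that warns once per failed default value."""
    LOGGER.warning("Failed to read values of '%s' as dates.", "time")
    LOGGER.warning("Failed to read values of '%s' as dates or ordinals.", "time")


class TestWarnings(unittest.TestCase):
    def test_warning_order(self):
        # assertLogs collects formatted records in the order they were logged
        with self.assertLogs("sketch.hazard", "WARNING") as cm:
            read_defaults()
        self.assertEqual(len(cm.output), 2)
        self.assertIn("as dates.", cm.output[0])
        self.assertIn("as dates or ordinals.", cm.output[1])


suite = unittest.TestLoader().loadTestsFromTestCase(TestWarnings)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If the order of the default values were ever swapped, index-based assertions like these would fail even though both warnings are still issued; checking membership across all of `cm.output` would be one order-independent alternative.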
