Merged
111 changes: 111 additions & 0 deletions docs/src/further_topics/dataless_cubes.rst
@@ -0,0 +1,111 @@
.. _dataless-cubes:

==============
Dataless Cubes
==============
A cube can exist without a data payload.
In that case ``cube.data`` is ``None``, instead of containing a (real or lazy) array as
usual.

This can be useful when the cube is used purely as a placeholder for metadata, e.g. to
represent a combination of coordinates.

Most notably, dataless cubes can be used as the target "grid cube" for most regridding
schemes, since regridding uses only the target cube's coordinates.
See also :func:`iris.util.make_gridcube`.


Properties of dataless cubes
----------------------------

* ``cube.shape`` is unchanged
* ``cube.data`` is ``None``
* ``cube.dtype`` is ``None``
* ``cube.core_data()`` and ``cube.lazy_data()`` both return ``None``
* ``cube.is_dataless()`` returns ``True``
* ``cube.has_lazy_data()`` returns ``False``


Cube creation
-------------
You can create a dataless cube with the :class:`~iris.cube.Cube` constructor
(i.e. the ``__init__`` call), by specifying the ``shape`` keyword in place of ``data``.
If both are specified, an error is raised (even if the data and shape are compatible).


Data assignment
---------------
You can make an existing cube dataless by setting ``cube.data = None``.
The data array is simply discarded.

Likewise, you can add data by assigning any data array of the correct shape, which
turns it back into a 'normal' cube.

Note that ``cube.dtype`` always matches ``cube.data.dtype``, so a dataless cube has a
dtype of ``None``.


Cube copy
---------
The syntax that allows you to replace data on copying,
e.g. ``cube2 = cube.copy(new_data)``, has been extended to accept the special value
:data:`iris.DATALESS`.

So, ``cube2 = cube.copy(iris.DATALESS)`` makes ``cube2`` a
dataless copy of ``cube``.
This is equivalent to ``cube2 = cube.copy(); cube2.data = None``.


Save and Load
-------------
The NetCDF file interface can save and re-load dataless cubes correctly.
See: :ref:`save_load_dataless`.

.. _dataless_merge:

Merging
-------
Merging is fully supported for dataless cubes, including combining them with "normal"
cubes.

* In all cases, the result has the same shape and metadata as it would if all the
  input cubes had data.
* Merging multiple dataless cubes produces a dataless result.
* Merging dataless and non-dataless cubes results in a partially 'missing' data array,
i.e. the relevant sections are filled with masked data.
* Laziness is also preserved.


Operations NOT supported
-------------------------
Dataless cubes are relatively new, and as yet only partly integrated with Iris cube
operations generally.

The following are some of the notable features which do *not* currently support
dataless cubes:

* plotting

* cube arithmetic

* statistics

* concatenation

* :meth:`iris.cube.CubeList.realise_data`

* various :class:`~iris.cube.Cube` methods, including at least:

* :meth:`~iris.cube.Cube.convert_units`

* :meth:`~iris.cube.Cube.subset`

* :meth:`~iris.cube.Cube.intersection`

* :meth:`~iris.cube.Cube.slices`

* :meth:`~iris.cube.Cube.interpolate`

* :meth:`~iris.cube.Cube.regrid`
Note: in this case the target ``grid`` can be dataless, but not the source
(``self``) cube.
1 change: 1 addition & 0 deletions docs/src/further_topics/index.rst
@@ -15,6 +15,7 @@ Extra information on specific technical issues.
lenient_maths
um_files_loading
missing_data_handling
dataless_cubes
netcdf_io
dask_best_practices/index
ugrid/index
5 changes: 3 additions & 2 deletions docs/src/further_topics/netcdf_io.rst
@@ -188,9 +188,10 @@ Deferred Saving

TBC

.. _save_load_dataless:

Dataless Cubes
--------------
Dataless Cubes in NetCDF files
------------------------------
It is now possible to have "dataless" cubes, where ``cube.data is None``.
When these are saved to a NetCDF file, the result is a netCDF variable with
all-unwritten data (meaning that it takes up no storage space).
11 changes: 8 additions & 3 deletions docs/src/whatsnew/latest.rst
@@ -30,9 +30,9 @@ This document explains the changes made to Iris for this release
✨ Features
===========

#. `@pp-mo`_ added a new utility function for making a test cube with a specified 2D
horizontal grid.
(:issue:`5770`, :pull:`6581`)
#. `@pp-mo`_ added the :func:`~iris.util.make_gridcube` utility function, for making a
dataless test-cube with a specified 2D horizontal grid.
(:issue:`5770`, :pull:`6581`, :pull:`6741`)

#. `@bjlittle`_ extended ``zlib`` compression of :class:`~iris.cube.Cube` data
payload when saving to NetCDF to also include any attached `CF-UGRID`_
@@ -53,6 +53,11 @@ This document explains the changes made to Iris for this release
:func:`~iris.cube.Cube.slices` to work with dataless cubes.
(:issue:`6725`, :pull:`6724`)

#. `@pp-mo`_ added the ability to merge dataless cubes. This also means they can be
re-loaded normally with :meth:`iris.load`. See: :ref:`dataless_merge`.
Also added a new documentation section on dataless cubes.
(:issue:`6740`, :pull:`6741`)


🐛 Bugs Fixed
=============
16 changes: 10 additions & 6 deletions lib/iris/_data_manager.py
@@ -34,12 +34,16 @@ def __init__(self, data, shape=None):
dataless.
"""
if (shape is None) and (data is None):
msg = 'one of "shape" or "data" should be provided; both are None'
raise ValueError(msg)
elif (shape is not None) and (data is not None):
msg = '"shape" should only be provided if "data" is None'
raise ValueError(msg)
if shape is None:
if data is None:
msg = 'one of "shape" or "data" should be provided; both are None'
raise ValueError(msg)
else:
if data is not None:
msg = '"shape" should only be provided if "data" is None'
raise ValueError(msg)
# Normalise how shape is recorded
shape = tuple(shape)

# Initialise the instance.
self._shape = shape
72 changes: 56 additions & 16 deletions lib/iris/_merge.py
@@ -12,6 +12,7 @@
from collections import OrderedDict, namedtuple
from copy import deepcopy

import dask.array as da
import numpy as np

from iris._lazy_data import (
@@ -430,7 +431,13 @@ def match(self, other, error_on_mismatch):
if self.data_shape != other.data_shape:
msg = "cube.shape differs: {} != {}"
msgs.append(msg.format(self.data_shape, other.data_shape))
if self.data_type != other.data_type:
if (
self.data_type is not None
and other.data_type is not None
and self.data_type != other.data_type
):
# N.B. allow "None" to match any other dtype: this means that dataless
# cubes can merge with 'dataful' ones.
msg = "cube data dtype differs: {} != {}"
msgs.append(msg.format(self.data_type, other.data_type))
# Both cell_measures_and_dims and ancillary_variables_and_dims are
@@ -1109,8 +1116,6 @@ def __init__(self, cube):
source-cube.
"""
if cube.is_dataless():
raise iris.exceptions.DatalessError("merge")
# Default hint ordering for candidate dimension coordinates.
self._hints = [
"time",
@@ -1240,7 +1245,10 @@ def merge(self, unique=True):
# their data loaded then at the end we convert the stack back
# into a plain numpy array.
stack = np.empty(self._stack_shape, "object")
all_have_data = True
all_have_real_data = True
some_are_dataless = False
part_shape: tuple = None
part_dtype: np.dtype = None
for nd_index in nd_indexes:
# Get the data of the current existing or last known
# good source-cube
@@ -1249,18 +1257,45 @@
data = self._skeletons[group[offset]].data
# Ensure the data is represented as a dask array and
# slot that array into the stack.
if is_lazy_data(data):
all_have_data = False
if data is None:
some_are_dataless = True
else:
data = as_lazy_data(data)
# We have (at least one) array content : Record the shape+dtype
part_shape = data.shape
part_dtype = data.dtype
# ensure lazy (we make the result real, later, if all were real)
if is_lazy_data(data):
all_have_real_data = False
else:
data = as_lazy_data(data)
stack[nd_index] = data

merged_data = multidim_lazy_stack(stack)
if all_have_data:
# All inputs were concrete, so turn the result back into a
# normal array.
merged_data = as_concrete_data(merged_data)
merged_cube = self._get_cube(merged_data)
if part_shape is None:
# NO parts had data : the result will also be dataless
merged_data = None
merged_shape = self._shape
else:
# At least some inputs had data : the result will have a data array.
if some_are_dataless:
# Some parts were dataless: fill these with a lazy all-missing array.
missing_part = da.ma.masked_array(
data=da.zeros(part_shape, dtype=part_dtype),
mask=da.ones(part_shape, dtype=bool),
dtype=part_dtype,
)
for inds in np.ndindex(stack.shape):
if stack[inds] is None:
stack[inds] = missing_part

# Make a single lazy merged result array
merged_data = multidim_lazy_stack(stack)
merged_shape = None
if all_have_real_data:
# All inputs were concrete, so turn the result back into a
# normal array.
merged_data = as_concrete_data(merged_data)

merged_cube = self._get_cube(merged_data, shape=merged_shape)
merged_cubes.append(merged_cube)

return merged_cubes
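The all-missing filler used in the merge above can be sketched in isolation, with
plain dask/numpy and no Iris dependency (the shape and dtype values are arbitrary):

```python
import dask.array as da
import numpy as np

# A lazy, fully masked stand-in with a given shape and dtype, as used to
# fill the stack slots left by dataless parts before lazily stacking.
part_shape = (2, 3)
part_dtype = np.dtype("f8")
missing_part = da.ma.masked_array(
    da.zeros(part_shape, dtype=part_dtype),
    mask=da.ones(part_shape, dtype=bool),
)

realised = missing_part.compute()
print(realised.shape)       # (2, 3)
print(realised.mask.all())  # True -- every element is masked
```

Because the filler is a lazy dask array, no memory is allocated for the missing
sections until (and unless) the merged data is realised.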
@@ -1291,8 +1326,6 @@ def register(self, cube, error_on_mismatch=False):
this :class:`ProtoCube`.
"""
if cube.is_dataless():
raise iris.exceptions.DatalessError("merge")
cube_signature = self._cube_signature
other = self._build_signature(cube)
match = cube_signature.match(other, error_on_mismatch)
@@ -1545,12 +1578,18 @@ def name_in_independents():
# deferred loading, this does NOT change the shape.
self._shape.extend(signature.data_shape)

def _get_cube(self, data):
def _get_cube(self, data, shape=None):
"""Generate fully constructed cube.
Return a fully constructed cube for the given data, containing
all its coordinates and metadata.
Parameters
----------
data : array_like
Cube data content. If None, `shape` must be set and the result is dataless.
shape : tuple, optional
Cube data shape, only used if data is None.
"""
signature = self._cube_signature
dim_coords_and_dims = [
@@ -1573,6 +1612,7 @@ def _get_cube(self, data):
aux_coords_and_dims=aux_coords_and_dims,
cell_measures_and_dims=cms_and_dims,
ancillary_variables_and_dims=avs_and_dims,
shape=shape,
**kwargs,
)

2 changes: 1 addition & 1 deletion lib/iris/cube.py
@@ -5160,7 +5160,7 @@ def interpolate(
"""
if self.is_dataless():
raise iris.exceptions.DatalessError("interoplate")
raise iris.exceptions.DatalessError("interpolate")
coords, points = zip(*sample_points)
interp = scheme.interpolator(self, coords) # type: ignore[arg-type]
return interp(points, collapse_scalar=collapse_scalar)
2 changes: 1 addition & 1 deletion lib/iris/fileformats/pp.py
@@ -54,7 +54,7 @@
"save_pairs_from_cube",
]


#: Standard spherical earth radius, as defined for MetOffice Unified Model.
EARTH_RADIUS = 6371229.0

