pp-mo · pp-mo · Oct 2, 2025 · Mar 6, 2025 · Mar 25, 2025 · Sep 2, 2025
diff --git a/docs/changelog_fragments/161.doc.rst b/docs/changelog_fragments/161.doc.rst
@@ -0,0 +1 @@
+Added a `userguide page <userdocs/user_guide/utilities.html>`_ summarising all the utility features in :mod:`ncdata.utils`.
diff --git a/docs/changelog_fragments/68.feat.rst b/docs/changelog_fragments/68.feat.rst
@@ -0,0 +1 @@
+Added utilities to extract sub-regions by indexing on dimensions: :func:`~ncdata.utils.index_by_dimensions` and :class:`~ncdata.utils.Slicer`.
diff --git a/docs/userdocs/user_guide/common_operations.rst b/docs/userdocs/user_guide/common_operations.rst
@@ -73,6 +73,8 @@ Example :
     The utility function :func:`~ncdata.utils.rename_dimension` is provided for this.
     See : :ref:`howto_rename_dimension`.
 
+.. _copy_notes:
+
 Copying
 -------
 All core objects support a ``.copy()`` method.  See for instance
@@ -132,6 +134,7 @@ comprehensive and may be very costly for instance comparing large data arrays, b
 also allow more nuanced and controllable checking, e.g. to skip data array comparisons
 or ignore variable ordering.
 
+.. _object_creation:
 
 Object Creation
 ---------------

diff --git a/docs/userdocs/user_guide/howtos.rst b/docs/userdocs/user_guide/howtos.rst
@@ -288,7 +288,7 @@ attribute already exists or not.
 .. Note::
 
     Assigning attributes when *creating* a dataset, variable or group is somewhat
-    simpler, discussed :ref:`here <todo>`.
+    simpler, discussed :ref:`here <object_creation>`.
 
 
 .. _howto_create_variable:
@@ -356,6 +356,53 @@ It can be freely overwritten by the user.
     valid dimensions, and that ``.data`` arrays match the dimensions.
 
 
+.. _howto_copy:
+
+Make a copy of data
+-------------------
+Use the :meth:`ncdata.NcData.copy` method to make a copy.
+
+.. testsetup::
+
+    >>> from ncdata.utils import dataset_differences
+
+.. doctest::
+
+    >>> data2 = data.copy()
+    >>> assert dataset_differences(data, data2) == []
+
+Note that this creates all-new independent ncdata objects, but all variable data arrays
+will be linked to the originals (to avoid making copies).
+
+See: :ref:`copy_notes`
+
+.. _howto_slice:
+
+Extract a subsection by indexing
+--------------------------------
+The neatest way is usually to use a :class:`~ncdata.utils.Slicer`.
+
+.. testsetup::
+
+    >>> import numpy as np
+    >>> from ncdata import NcData, NcDimension, NcVariable
+    >>> from ncdata.utils import Slicer
+    >>> full_data = NcData(dimensions=[NcDimension("time", 5), NcDimension("level", 6), NcDimension("z", 3)])
+    >>> for nn, dim in full_data.dimensions.items():
+    ...    full_data.variables.add(NcVariable(nn, dimensions=[nn], data=np.arange(dim.size)))
+
+.. doctest::
+
+    >>> slice_TLZ = Slicer(full_data, ["time", "level", "z"])
+    >>> data_region = slice_TLZ[:3, 1::2, 2]
+
+.. doctest::
+
+    >>> print({nn: full_data.variables[nn].data for nn in full_data.dimensions})
+    {'time': array([0, 1, 2, 3, 4]), 'level': array([0, 1, 2, 3, 4, 5]), 'z': array([0, 1, 2])}
+    >>> print({nn: data_region.variables[nn].data for nn in data_region.dimensions})
+    {'time': array([0, 1, 2]), 'level': array([1, 3, 5])}
+
+
 Read data from a NetCDF file
 ----------------------------
 Use the :func:`ncdata.netcdf4.from_nc4` function to load a dataset from a netCDF file.

diff --git a/docs/userdocs/user_guide/user_guide.rst b/docs/userdocs/user_guide/user_guide.rst
@@ -9,5 +9,6 @@ Detailed explanations, beyond the basic tutorial-style introductions
     design_principles
     data_objects
     common_operations
+    utilities
     general_topics
     howtos
diff --git a/docs/userdocs/user_guide/utilities.rst b/docs/userdocs/user_guide/utilities.rst
@@ -0,0 +1,148 @@
+Utilities and Conveniences
+==========================
+This section provides a short overview of the more involved operations which are
+provided in the :mod:`~ncdata.utils` module.  In all cases, more detail is available in
+the `API pages <../../details/api/ncdata.utils.html>`_.
+
+Rename Dimensions
+-----------------
+The :func:`~ncdata.utils.rename_dimension` utility does this, in a way which ensures a
+safe and consistent result.
+
+Dataset Equality Testing
+------------------------
+The function :func:`~ncdata.utils.dataset_differences` produces a list of messages
+detailing all the ways in which two datasets are different.
+
+For Example:
+^^^^^^^^^^^^
+.. testsetup::
+
+    >>> from ncdata import NcData, NcDimension, NcVariable
+    >>> from ncdata.utils import dataset_differences
+    >>> import numpy as np
+
+.. doctest::
+
+    >>> data1 = NcData(
+    ...   dimensions=[NcDimension("x", 5)],
+    ...   variables=[NcVariable("vx", dimensions=["x"], data=np.arange(5))]
+    ... )
+    >>> data2 = data1.copy()
+    >>> print(dataset_differences(data1, data2))
+    []
+
+.. doctest::
+
+    >>> data2.dimensions["x"].unlimited = True
+    >>> data2.variables["vx"].data = np.array([1, 3])  # NB must be a *new* array !
+
+.. doctest::
+
+    >>> diffs = dataset_differences(data1, data2)
+    >>> for msg in diffs:
+    ...    print(msg)
+    Dataset "x" dimension has different "unlimited" status : False != True
+    Dataset variable "vx" shapes differ : (5,) != (2,)
+
+.. note::
+   To compare isolated variables, a subsidiary routine
+   :func:`~ncdata.utils.variable_differences` is also provided.
+
+Sub-indexing
+------------
+A new dataset can be derived by indexing over dimensions, analogous to sub-indexing
+an array.  This operation indexes all the variables appropriately, to produce a new
+independent dataset which is complete and self-consistent.
+
+The function :func:`~ncdata.utils.index_by_dimensions` provides indexing where the
+indices are passed as arguments or keywords for the specific dimensions.
+
+For example:
+
+.. testsetup::
+
+    >>> from ncdata.utils import index_by_dimensions
+
+.. doctest::
+
+    >>> data = NcData(
+    ...   dimensions=[NcDimension("y", 4), NcDimension("x", 10)],
+    ...   variables=[NcVariable(
+    ...      "v1", dimensions=["y", "x"],
+    ...      data=np.arange(40).reshape((4, 10))
+    ...   )]
+    ... )
+
+.. doctest::
+
+    >>> subdata = index_by_dimensions(data, y=2, x=slice(None, 4))
+    >>> print(subdata)
+    <NcData: <'no-name'>
+        dimensions:
+            x = 4
+    <BLANKLINE>
+        variables:
+            <NcVariable(int64): v1(x)>
+    >
+    >>> print(subdata.variables["v1"].data)
+    [20 21 22 23]
+
+Slicing syntax
+^^^^^^^^^^^^^^
+The :class:`~ncdata.utils.Slicer` class is provided to enable the same operation to be
+expressed using multi-dimensional slicing syntax.
+
+For example, the operation above is more neatly expressed like this:
+
+.. testsetup::
+
+    >>> from ncdata.utils import Slicer
+
+.. doctest::
+
+    >>> data_slicer = Slicer(data, ["y", "x"])
+    >>> subdata2 = data_slicer[2, :4]
+
+.. doctest::
+
+    >>> dataset_differences(subdata, subdata2) == []
+    True
+
+
+Consistency Checking
+--------------------
+The :func:`~ncdata.utils.save_errors` function provides a general
+correctness-and-consistency check.
+
+For example:
+
+.. testsetup::
+
+    >>> from ncdata.utils import save_errors
+
+.. doctest::
+
+    >>> data_bad = data.copy()
+    >>> array = data_bad.variables["v1"].data
+    >>> data_bad.variables["v1"].data = array[:2]
+    >>> data_bad.variables.add(NcVariable("q", data={"x": 4}))
+
+.. doctest::
+
+    >>> for msg in save_errors(data_bad):
+    ...    print(msg)
+    Variable 'v1' data shape = (2, 10), does not match that of its dimensions = (4, 10).
+    Variable 'q' has a dtype which cannot be saved to netcdf : dtype('O').
+
+
+See : :ref:`correctness-checks`
+
+
+Data Copying
+------------
+The :func:`~ncdata.utils.ncdata_copy` function makes structural copies of datasets.
+However, the same operation is more easily accessed as :meth:`ncdata.NcData.copy`.
+
+See: :ref:`copy_notes`
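
What a "structural" copy means can be illustrated with a small plain-Python sketch. This is not the ncdata code: the `Var` class and `structural_copy` function are hypothetical stand-ins, and plain lists stand in for data arrays. It assumes the behaviour described in the how-to text, namely that the copy has new, independent objects while the data arrays stay linked to the originals.

```python
from dataclasses import dataclass, field


@dataclass
class Var:
    """Hypothetical stand-in for a variable: name, data array and attributes."""

    name: str
    data: list
    attrs: dict = field(default_factory=dict)


def structural_copy(variables):
    """Build new containers and attribute dicts, but share each data array."""
    return {
        name: Var(var.name, var.data, dict(var.attrs))
        for name, var in variables.items()
    }


original = {"vx": Var("vx", [0, 1, 2, 3, 4], {"units": "m"})}
copied = structural_copy(original)

# The copied objects and their attributes are independent of the originals ...
copied["vx"].attrs["units"] = "km"

# ... but, at this point, both variables refer to the same data array.
shares_data = copied["vx"].data is original["vx"].data

# Assigning a *new* array to the copy leaves the original untouched.
copied["vx"].data = [1, 3]
```

This is why the equality-testing example assigns a new array to `data2.variables["vx"].data` rather than modifying the existing array in place: an in-place change would also show up in the original.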
diff --git a/lib/ncdata/utils/__init__.py b/lib/ncdata/utils/__init__.py
@@ -2,11 +2,14 @@
 
 from ._compare_nc_datasets import dataset_differences, variable_differences
 from ._copy import ncdata_copy
+from ._dim_indexing import Slicer, index_by_dimensions
 from ._rename_dim import rename_dimension
 from ._save_errors import save_errors
 
 __all__ = [
+    "Slicer",
     "dataset_differences",
+    "index_by_dimensions",
     "ncdata_copy",
     "rename_dimension",
     "save_errors",