Commit d24e7b9

Dim slicer (#120)
* Initial dim slicing: WIP no groups handling, Slicer untested as yet.
* Start of indexer testing (WIP: incomplete).
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Small improvements.
* Generalise testing + extend to Slicer tests.
* More tests; small fixes; more docs and api-docs. whoops tweak
* Add whatsnew.
* Add whastnew for the new utilities page.
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Define .slicer for Ncdata; support full == checking on datasets and variables.
* Simplify usage modes and API of indexing utilities.
* Fix tests for simplified indexing features.
* Add tests for new core object methods.
* Fix docstrings + doctests.
* Replace dataset difference with equality tests in examples.
* Document relation between equality testing and difference utilities.
* Added whatsnew for dataset/variable equality support.
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Don't keep fragments when building html.
* Improved indexing whatsnew.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 76d792c commit d24e7b9

16 files changed: +984 -21 lines changed

docs/Makefile

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ allapi:
 	sphinx-apidoc -Mfe -o ./details/api ../lib/ncdata

 towncrier:
-	towncrier build --keep
+	towncrier build --yes


 # Tweaked "make html", which restores the changelog state after docs build.
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Added a `userguide page <userdocs/user_guide/utilities.html>`_ summarising all the utility features in :mod:`ncdata.utils`.
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Provide exact == and != for datasets and variables, by just calling the difference utilities.
+This can be inefficient, but is simple to understand and generally useful.
+See: :ref:`equality_testing`
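
As a rough illustration of the approach described in this fragment (equality defined by calling a difference utility and checking for an empty result), here is a minimal standalone sketch. ``ToyVariable`` and ``toy_differences`` are hypothetical names invented for this example; they are not the ncdata classes or the actual implementation in this commit.

```python
from dataclasses import dataclass, field


def toy_differences(a, b):
    """Return a list of difference messages; an empty list means 'no differences'."""
    msgs = []
    if a.name != b.name:
        msgs.append(f"names differ: {a.name!r} != {b.name!r}")
    if a.dims != b.dims:
        msgs.append(f"dims differ: {a.dims!r} != {b.dims!r}")
    if a.data != b.data:
        # This is the potentially costly step: a full element-wise comparison.
        msgs.append("data arrays differ")
    return msgs


@dataclass(eq=False)  # eq=False: we supply our own __eq__ below
class ToyVariable:
    """Hypothetical stand-in for a variable object (not ncdata.NcVariable)."""

    name: str
    dims: tuple
    data: list = field(default_factory=list)

    def __eq__(self, other):
        if not isinstance(other, ToyVariable):
            return NotImplemented
        # Equality simply means "the difference utility found nothing".
        return toy_differences(self, other) == []


v1 = ToyVariable("x", ("x",), [0, 1, 2])
v2 = ToyVariable("x", ("x",), [0, 1, 2])
v3 = ToyVariable("x", ("x",), [0, 1, 9])
print(v1 == v2)                 # True
print(v1 != v3)                 # True  (Python derives != from __eq__)
print(toy_differences(v1, v3))  # ['data arrays differ']
```

The trade-off is the one the fragment names: ``==`` always performs the full comparison, while calling the difference function directly leaves room for cheaper or more tolerant checks.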
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+Added the ability to extract a sub-region by indexing/slicing over dimensions.
+The :class:`ncdata.NcData` objects can be indexed with the ``[]`` operation, or over
+specified dimensions with the :meth:`~ncdata.NcData.slicer` method.
+This is based on the new :func:`~ncdata.utils.index_by_dimensions` utility function
+and :class:`~ncdata.utils.Slicer` class.
+See: :ref:`indexing_overview`
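
To make the idea of indexing "over specified dimensions" concrete, here is a small pure-Python sketch. It is not the ncdata implementation: ``toy_index`` and ``ToySlicer`` are hypothetical names, variables are reduced to one-dimensional lists, and it is an assumption of this sketch that an integer key turns the variable into a scalar with no dimension.

```python
def toy_index(variables, **dim_keys):
    """Index 1-D toy variables by dimension name.

    ``variables`` maps a variable name to a ``(dim_name, values)`` pair.
    A slice key keeps the dimension; an integer key selects one element and
    drops the dimension (recorded here as ``dim_name = None``).
    """
    result = {}
    for name, (dim, values) in variables.items():
        key = dim_keys.get(dim)
        if key is None:
            # This variable's dimension is not being indexed: keep it whole.
            result[name] = (dim, list(values))
        elif isinstance(key, slice):
            result[name] = (dim, list(values[key]))
        else:
            result[name] = (None, values[key])
    return result


class ToySlicer:
    """Binds dimension names, so a later ``[]`` applies its keys to them in order."""

    def __init__(self, variables, *dim_names):
        self.variables = variables
        self.dim_names = dim_names

    def __getitem__(self, keys):
        if not isinstance(keys, tuple):
            keys = (keys,)
        return toy_index(self.variables, **dict(zip(self.dim_names, keys)))


full = {"x": ("x", list(range(7))), "y": ("y", list(range(6)))}

# Index "y" with 3 and "x" with 1::2, naming the dimensions explicitly.
region = ToySlicer(full, "y", "x")[3, 1::2]
print(region["x"])     # ('x', [1, 3, 5])
print(region["y"])     # (None, 3)

# The same selection with the dimensions listed in the other order.
same = ToySlicer(full, "x", "y")[1::2, 3]
print(region == same)  # True
```

Naming the dimensions up front means the order of the keys does not depend on the order in which the dataset happens to declare its dimensions.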

docs/userdocs/user_guide/common_operations.rst

Lines changed: 18 additions & 16 deletions
@@ -73,6 +73,8 @@ Example :
 The utility function :func:`~ncdata.utils.rename_dimension` is provided for this.
 See : :ref:`howto_rename_dimension`.

+.. _copy_notes:
+
 Copying
 -------
 All core objects support a ``.copy()`` method. See for instance
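
The copy behaviour documented in this commit (new, independent objects whose data arrays remain linked to the originals) can be pictured with the following sketch. ``ToyVar`` is a hypothetical class made up for illustration, with a plain list standing in for an array; it is not ncdata code.

```python
class ToyVar:
    """Hypothetical stand-in for a variable: a name plus a data container."""

    def __init__(self, name, data):
        self.name = name
        self.data = data

    def copy(self):
        # New wrapper object, but the *same* underlying data object (no array copy).
        return ToyVar(self.name, self.data)


original = ToyVar("x", [0, 1, 2])
linked = original.copy()

print(linked is original)            # False: an independent object
print(linked.data is original.data)  # True: the data is shared

# Changes to the copy's own attributes do not affect the original.
linked.name = "renamed"
print(original.name)                 # x

# To detach the data as well, copy it explicitly (compare the docs'
# ``var.data = var.data.copy()`` for real arrays).
linked.data = list(linked.data)
linked.data[0] = 99
print(original.data)                 # [0, 1, 2]
```

Sharing the data avoids duplicating large arrays; the cost is that in-place edits to shared data are visible through both objects until the data is copied explicitly.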
@@ -115,23 +117,24 @@ For real data, this is just ``var.data = var.data.copy()``.
 There is also a utility function :func:`ncdata.utils.ncdata_copy` : This is
 effectively the same thing as the NcData object :meth:`~ncdata.NcData.copy` method.

+.. _equality_testing:
+
+Equality Testing
+----------------
+We implement equality operations ``==`` / ``!=`` for all the core data objects.

-Equality Checking
------------------
-We provide a simple, comprehensive ``==`` check for :mod:`~ncdata.NcDimension` and
-:mod:`~ncdata.NcAttribute` objects, but not at present :mod:`~ncdata.NcVariable` or
-:mod:`~ncdata.NcData`.
+However, simple equality testing on :class:`~ncdata.NcData` and :class:`~ncdata.NcVariable`
+objects can be very costly if it requires comparing large data arrays.

-So, using ``==`` on :mod:`~ncdata.NcVariable` or :mod:`~ncdata.NcData` objects
-will only do an identity check -- that is, it tests ``id(A) == id(B)``, or ``A is B``.
+If you need to avoid comparing large (and possibly lazy) arrays then you can use the
+:func:`ncdata.utils.dataset_differences` and
+:func:`ncdata.utils.variable_differences` utility functions.
+These functions also provide multiple options to enable more tolerant comparison,
+such as allowing variables to have a different ordering.

-However, these objects **can** be properly compared with the dataset comparison
-utilities, :func:`ncdata.utils.dataset_differences` and
-:func:`ncdata.utils.variable_differences`. By default, these operations are very
-comprehensive and may be very costly for instance comparing large data arrays, but they
-also allow more nuanced and controllable checking, e.g. to skip data array comparisons
-or ignore variable ordering.
+See: :ref:`utils_equality`

+.. _object_creation:

 Object Creation
 ---------------
@@ -184,8 +187,7 @@ The result is the same:

 .. doctest:: python

-    >>> from ncdata.utils import dataset_differences
-    >>> print(dataset_differences(data1, data2))
-    []
+    >>> data1 == data2
+    True


docs/userdocs/user_guide/howtos.rst

Lines changed: 62 additions & 3 deletions
@@ -288,7 +288,7 @@ attribute already exists or not.
 .. Note::

     Assigning attributes when *creating* a dataset, variable or group is somewhat
-    simpler, discussed :ref:`here <todo>`.
+    simpler, discussed :ref:`here <object_creation>`.


 .. _howto_create_variable:
@@ -356,6 +356,66 @@ It can be freely overwritten by the user.
 valid dimensions, and that ``.data`` arrays match the dimensions.


+.. _howto_copy:
+
+Make a copy of data
+-------------------
+Use the :meth:`ncdata.NcData.copy` method to make a copy.
+
+.. doctest::
+
+    >>> data2 = data.copy()
+    >>> data == data2
+    True
+
+Note that this creates all-new independent ncdata objects, but all variable data arrays
+will be linked to the originals (to avoid making copies).
+
+See: :ref:`copy_notes`
+
+.. _howto_slice:
+
+Extract a subsection by indexing
+--------------------------------
+The nicest way is usually just to use the :meth:`~ncdata.NcData.slicer` method to specify
+dimensions to index, and then index the result.
+
+.. testsetup::
+
+    >>> import numpy as np
+    >>> from ncdata import NcData, NcDimension, NcVariable
+    >>> full_data = NcData(dimensions=[NcDimension("x", 7), NcDimension("y", 6)])
+    >>> for nn, dim in full_data.dimensions.items():
+    ...     full_data.variables.add(NcVariable(nn, dimensions=[nn], data=np.arange(dim.size)))
+
+.. doctest::
+
+    >>> for dimname in full_data.dimensions:
+    ...     print(dimname, ':', full_data.variables[dimname].data)
+    x : [0 1 2 3 4 5 6]
+    y : [0 1 2 3 4 5]
+
+.. doctest::
+
+    >>> data_region = full_data.slicer("y", "x")[3, 1::2]
+
+.. doctest::
+
+    >>> for dimname in data_region.dimensions:
+    ...     print(dimname, ':', data_region.variables[dimname].data)
+    x : [1 3 5]
+
+You can also slice data directly, which simply acts on the dimensions in order:
+
+.. doctest::
+
+    >>> data_region_2 = full_data[1::2, 3]
+    >>> data_region_2 == data_region
+    True
+
+See: :ref:`indexing_overview`
+
+
 Read data from a NetCDF file
 ----------------------------
 Use the :func:`ncdata.netcdf4.from_nc4` function to load a dataset from a netCDF file.
@@ -658,8 +718,7 @@ In fact, there should be NO difference between these two.

 .. doctest:: python

-    >>> from ncdata.utils import dataset_differences
-    >>> print(dataset_differences(data, data2) == [])
+    >>> data == data2
     True


docs/userdocs/user_guide/user_guide.rst

Lines changed: 1 addition & 0 deletions
@@ -9,5 +9,6 @@ Detailed explanations, beyond the basic tutorial-style introductions
     design_principles
     data_objects
     common_operations
+    utilities
     general_topics
     howtos
