
Commit 81a0c68

Initial Sphinx documentation content. (#76)
* Initial Sphinx documentation content.
* Improved Sphinx config.
* Update README.
* Create placeholders for extended User Guide content.
* Reference docs pages.
* Fix User Guide page.
* Fix some links.
* Tie off broken link.
1 parent 4ef8f0c commit 81a0c68

28 files changed: +969, -216 lines

.pre-commit-config.yaml

Lines changed: 0 additions & 6 deletions
@@ -59,12 +59,6 @@ repos:
       - id: blacken-docs
         types: [file, rst]

-  - repo: https://github.com/aio-libs/sort-all
-    rev: v1.2.0
-    hooks:
-      - id: sort-all
-        types: [file, python]
-
   - repo: https://github.com/pycqa/pydocstyle
     rev: 6.3.0
     hooks:

.readthedocs.yml

Lines changed: 14 additions & 0 deletions
@@ -4,9 +4,23 @@ build:
   os: ubuntu-20.04
   tools:
     python: mambaforge-4.10
+
   jobs:
+    # Content here largely copied from Iris
+    # see : https://github.com/SciTools/iris/pull/4855
+    post_checkout:
+      # The SciTools/iris repository is shallow i.e., has a .git/shallow,
+      # therefore complete the repository with a full history in order
+      # to allow setuptools-scm to correctly auto-discover the version.
+      - git fetch --unshallow
+      - git fetch --all
+    # Need to stash the local changes that Read the Docs makes so that
+    # setuptools_scm can generate the correct Iris version.
+    pre_install:
+      - git stash
   post_install:
     - sphinx-apidoc -Mfe -o ./docs/api ./lib/ncdata
+    - git stash pop

 conda:
   environment: requirements/readthedocs.yml
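
The `git fetch --unshallow` / `git stash` steps exist because setuptools-scm derives the package version from the git history, which a shallow Read the Docs checkout breaks. A minimal local check of that version discovery (a sketch, assuming setuptools-scm is installed; `get_version` is its standard entry point):

``` python
# Sketch: confirm setuptools-scm can resolve a version from the repo
# history -- this is exactly what fails in a shallow clone, and why the
# build above runs "git fetch --unshallow" before installing.
from setuptools_scm import get_version

# Assumes this is run from within the repository checkout.
print(get_version(root="."))  # e.g. "0.1.dev10+g81a0c68" (illustrative)
```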

README.md

Lines changed: 71 additions & 183 deletions
@@ -22,218 +22,106 @@ This enables the user to freely mix+match operations from both projects, getting
 > temp_cube = cubes.extract_cube("air_temperature")
 > qplt.contourf(temp_cube[0])

-## Contents
-* [Motivation](#motivation)
-* [Primary Use](#primary-use)
-* [Secondary Uses](#secondary-uses)
-* [Principles](#principles)
-* [Working Usage Examples](#code-examples)
-* [API documentation](#api-documentation)
-* [Installation](#installation)
-* [Project Status](#project-status)
-* [Change Notes](#change-notes)
-* [Code stability](#code-stability)
-* [Iris and Xarray version compatibility](#iris-and-xarray-compatibility)
-* [Current Limitations](#known-limitations)
-* [Known Problems](#known-problems)
-* [References](#references)
-* [Developer Notes](#developer-notes)
-
-# Motivation
-## Primary Use
-Fast and efficient translation of data between Xarray and Iris objects.
-
-This allows the user to mix+match features from either package in code.
-
-For example:
+# Purposes
+* represent netcdf data as structures of Python objects
+* easy manipulation of netcdf data with pythonic syntax
+* Fast and efficient translation of data between Xarray and Iris objects.
+* This allows the user to mix+match features from either package in code.
+
+See : https://ncdata.readthedocs.io/en/latest/userdocs/user_guide/design_principles.html
+
+# Documentation
+On ReadTheDocs. Please see:
+* [stable](https://ncdata.readthedocs.io/en/stable/index.html)
+* [latest](https://ncdata.readthedocs.io/en/latest/index.html)
+
+# Demonstration code examples:
+* [Apply Iris regrid to xarray data](#apply-iris-regrid-to-xarray-data)
+* [Use Zarr data in Iris](#use-zarr-data-in-iris)
+* [Correct a mis-coded attribute in Iris input](#correct-a-miscoded-attribute-in-iris-input)
+* [Rename a dimension in xarray output](#rename-a-dimension-in-xarray-output)
+* [Copy selected data to a new file](#copy-selected-data-to-a-new-file)
+
+## Apply Iris regrid to xarray data
 ``` python
 from ncdata.iris_xarray import cubes_to_xarray, cubes_from_xarray
-
-# Apply Iris regridder to xarray data
 dataset = xarray.open_dataset("file1.nc", chunks="auto")
 (cube,) = cubes_from_xarray(dataset)
 cube2 = cube.regrid(grid_cube, iris.analysis.PointInCell)
 dataset2 = cubes_to_xarray(cube2)
+```

-# Apply Xarray statistic to Iris data
-cubes = iris.load("file1.nc")
-dataset = cubes_to_xarray(cubes)
-dataset2 = dataset.group_by("time.dayofyear").argmin()
-cubes2 = cubes_from_xarray(dataset2)
+## Use Zarr data in Iris
+``` python
+from ncdata.threadlock_sharing import enable_lockshare
+enable_lockshare(iris=True, xarray=True)
+import xarray as xr
+dataset = xr.open_dataset(input_zarr_path, engine="zarr", chunks="auto")
+input_cubes = cubes_from_xarray(dataset)
+output_cubes = my_process(input_cubes)
+dataset2 = cubes_to_xarray(output_cubes)
+dataset2.to_zarr(output_zarr_path)
 ```
-* data conversion is equivalent to writing to a file with one library, and reading it
-  back with the other ..
-* .. except that no actual files are written
-* both real (numpy) and lazy (dask) variable data arrays are transferred directly,
-  without copying or computing
-
-
-## Secondary Uses
-### Exact control of file formatting
-Ncdata can also be used as a transfer layer between Iris or Xarray file i/o and the
-exact format of data stored in files.
-I.E. adjustments can be made to file data before loading it into Iris/Xarray; or
-Iris/Xarray saved output can be adjusted before writing to a file.
-
-This allows the user to workaround any package limitations in controlling storage
-aspects such as : data chunking; reserved attributes; missing-value processing; or
-dimension control.

-For example:
+## Correct a miscoded attribute in Iris input
 ``` python
-from ncdata.xarray import from_xarray
 from ncdata.iris import to_iris
-from ncdata.netcdf4 import to_nc4, from_nc4
+enable_lockshare(iris=True)
+ncdata = from_nc4(input_path)
+for var in ncdata.variables.values():
+    if "coords" in var.attributes:
+        var.attributes.rename("coords", "coordinates")
+cubes = to_iris(ncdata)
+```

-# Rename a dimension in xarray output
+## Rename a dimension in xarray output
+``` python
+enable_lockshare(xarray=True)
 dataset = xr.open_dataset("file1.nc")
 xr_ncdata = from_xarray(dataset)
-dim = xr_ncdata.dimensions.pop("dim0")
-dim.name = "newdim"
-xr_ncdata.dimensions["newdim"] = dim
+xr_ncdata.dimensions.rename("dim0", "newdim")
+# N.B. must also replace the name in dimension-lists of variables
 for var in xr_ncdata.variables.values():
     var.dimensions = ["newdim" if dim == "dim0" else dim for dim in var.dimensions]
 to_nc4(xr_ncdata, "file_2a.nc")
-
-# Fix chunking in Iris input
-ncdata = from_nc4("file1.nc")
-for var in ncdata.variables:
-    # custom chunking() mimics the file chunks we want
-    var.chunking = lambda: (100.0e6 if dim == "dim0" else -1 for dim in var.dimensions)
-cubes = to_iris(ncdata)
-```
-
-
-### Manipulation of data
-ncdata can also be used for data extraction and modification, similar to the scope of
-CDO and NCO command-line operators but without file operations.
-However, this type of usage is as yet still undeveloped : There is no inbuilt support
-for data consistency checking, or obviously useful operations such as indexing by
-dimension.
-This could be added in future, but it is also true that many such operations (like
-indexing) may be better done using Iris/Xarray.
-
-# Principles
-* ncdata represents NetCDF data as Python objects
-* ncdata objects can be freely manipulated, independent of any data file
-* ncdata variables can contain either real (numpy) or lazy (Dask) arrays
-* ncdata can be losslessly converted to and from actual NetCDF files
-* Iris or Xarray objects can be converted to and from ncdata, in the same way that
-  they are read from and saved to NetCDF files
-* **_translation_** between Xarray and Iris is based on conversion to ncdata, which
-  is in turn equivalent to file i/o
-* thus, Iris/Xarray translation is equivalent to _saving_ from one
-  package into a file, then _loading_ the file in the other package
-* ncdata exchanges variable data directly with Iris/Xarray, with no copying of real
-  data or computing of lazy data
-* ncdata exchanges lazy arrays with files using Dask 'streaming', thus allowing
-  transfer of arrays larger than memory
-
-
-# Code Examples
-* mostly TBD
-* proof-of-concept script for
-  [netCDF4 file i/o](https://github.com/pp-mo/ncdata/blob/main/tests/integration/example_scripts/ex_ncdata_netcdf_conversion.py)
-* proof-of-concept script for
-  [iris-xarray conversions](https://github.com/pp-mo/ncdata/blob/main/tests/integration/example_scripts/ex_iris_xarray_conversion.py)
-
-
-# API documentation
-* see the [ReadTheDocs build](https://ncdata.readthedocs.io/en/latest/index.html)
-
-
-# Installation
-Install from conda-forge with conda
-```
-conda install -c conda-forge ncdata
-```
-
-Or from PyPI with pip
-```
-pip install ncdata
 ```

-# Project Status
-
-## Code Stability
-We intend to follow [PEP 440](https://peps.python.org/pep-0440/) or (older) [SemVer](https://semver.org/) versioning principles.
-
-Minor release version is at **"v0.1"**.
-This is a first complete implementation, with functional operational of all public APIs.
-
-The code is however still experimental, and APIs are not stable (hence no major version yet).
+## Copy selected data to a new file
+``` python
+from ncdata.netcdf4 import from_nc4, to_nc4
+ncdata = from_nc4("file1.nc")

-## Change Notes
-### v0.1.1
-Small tweaks + bug fixes.
-**Note:** [#62](https://github.com/pp-mo/ncdata/pull/62) and [#59](https://github.com/pp-mo/ncdata/pull/59) are important fixes to achieve intended performance goals,
-i.e. moving arbitrarily large data via Dask without running out of memory.
+# Make a list of partial names to select the wanted variables
+keys = ["air_", "surface"]

-* Stop non-numpy attribute values from breaking attribute printout. [#63](https://github.com/pp-mo/ncdata/pull/63)
-* Stop ``ncdata.iris.from_iris()`` consuming full data memory for each variable. [#62](https://github.com/pp-mo/ncdata/pull/62)
-* Provide convenience APIs for ncdata component dictionaries and attribute values. [#61](https://github.com/pp-mo/ncdata/pull/61)
-* Use dask ``chunks="auto"`` in ``ncdata.netcdf4.from_nc4()``. [#59](https://github.com/pp-mo/ncdata/pull/59)
+# Explicitly add dimension names, to include all the dimension variables
+keys += list(ncdata.dimensions)

-### v0.1.0
-First release
+# Identify the wanted variables
+select_vars = [
+    var
+    for var in ncdata.variables.values()
+    if any(key in var.name for key in keys)
+]

-## Iris and Xarray Compatibility
-* C.I. tests GitHub PRs and merges, against latest releases of Iris and Xarray
-* compatible with iris >= v3.7.0
-* see : [support added in v3.7.0](https://scitools-iris.readthedocs.io/en/stable/whatsnew/3.7.html#internal)
+# Add any referenced coordinate variables
+for var in list(select_vars):
+    for coordname in var.attributes.get("coordinates", "").split():
+        select_vars.append(ncdata.variables[coordname])

-## Known limitations
-Unsupported features : _not planned_
-* user-defined datatypes are not supported
-  * this includes compound and variable-length types
+# Replace variables with only the wanted ones
+ncdata.variables.clear()
+ncdata.variables.addall(select_vars)

-Unsupported features : _planned for future release_
-* groups (not yet fully supported ?)
-* file output chunking control
+# Save
+to_nc4(ncdata, "pruned.nc")
+```

-## Known problems
-As-of v0.1.1
-* in conversion from iris cubes with [`from_iris`](https://ncdata.readthedocs.io/en/latest/api/ncdata.iris.html#ncdata.iris.from_iris),
-  use of an `unlimited_dims` key currently causes an exception
-  * https://github.com/pp-mo/ncdata/issues/43
-* in conversion to xarray with [`to_xarray`](https://ncdata.readthedocs.io/en/latest/api/ncdata.xarray.html#ncdata.xarray.to_xarray),
-  dataset encodings are not reproduced, most notably **the "unlimited_dims" control is missing**
-  * https://github.com/pp-mo/ncdata/issues/66

-# References
+# Older References in Iris
 * Iris issue : https://github.com/SciTools/iris/issues/4994
 * planning presentation : https://github.com/SciTools/iris/files/10499677/Xarray-Iris.bridge.proposal.--.NcData.pdf
 * in-Iris code workings : https://github.com/pp-mo/iris/pull/75


-# Developer Notes
-## Documentation build
-* For a full docs-build, a simple `make html` will do for now.
-* The ``docs/Makefile`` wipes the API docs and invokes sphinx-apidoc for a full rebuild
-* Results are then available at ``docs/_build/html/index.html``
-* The above is just for _local testing_ if required :
-  We have automatic builds for releases and PRs via [ReadTheDocs](https://readthedocs.org/projects/ncdata/)
-
-## Release actions
-1. Cut a release on GitHub : this triggers a new docs version on [ReadTheDocs](https://readthedocs.org/projects/ncdata/)
-1. Build the distribution
-   1. if needed, get [build](https://github.com/pypa/build)
-   2. run `python -m build`
-2. Push to PyPI
-   1. if needed, get [twine](https://github.com/pypa/twine)
-   2. run `python -m twine upload --repository testpypi dist/*`
-      * this uploads to TestPyPI
-   3. create a new env with test dependencies `conda create -n ncdtmp python=3.11 iris xarray filelock requests pytest pip`
-      (N.B. 'filelock' and 'requests' are _test_ dependencies of iris)
-   5. install the new package with `pip install --index-url https://test.pypi.org/simple/ ncdata` and run tests
-   6. if that checks OK, _remove_ `--repository testpypi` _and repeat_ #2
-      * --> uploads to "real" PyPI
-   7. repeat #4, _removing_ the `--index-url`, to check that `pip install ncdata` now finds the new version
-3. Update conda to source the new version from PyPI
-   1. create a PR on the [ncdata feedstock](https://github.com/conda-forge/ncdata-feedstock)
-      1. update :
-         * [version number](https://github.com/conda-forge/ncdata-feedstock/blob/3f6b35cbdffd2ee894821500f76f2b0b66f55939/recipe/meta.yaml#L2)
-         * [SHA](https://github.com/conda-forge/ncdata-feedstock/blob/3f6b35cbdffd2ee894821500f76f2b0b66f55939/recipe/meta.yaml#L10)
-         * Note : the [PyPI reference](https://github.com/conda-forge/ncdata-feedstock/blob/3f6b35cbdffd2ee894821500f76f2b0b66f55939/recipe/meta.yaml#L9) will normally look after itself
-         * Also : make any required changes to [dependencies](https://github.com/conda-forge/ncdata-feedstock/blob/3f6b35cbdffd2ee894821500f76f2b0b66f55939/recipe/meta.yaml#L17-L29) -- normally _no change required_
-      1. get PR merged ; wait a few hours ; check the new version appears in `conda search ncdata`
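
The removed "Principles" text claimed that conversions hand over lazy (Dask) arrays without copying or computing them. A rough way to check that locally, sketched under assumptions: `from_iris` is the converter named in the removed change notes, and variables are assumed to expose their array as `.data` (following the patterns in the examples above).

``` python
# Sketch: verify that iris -> ncdata conversion keeps variable data lazy.
import dask.array as da
import iris
from ncdata.iris import from_iris

cubes = iris.load("file1.nc")  # Iris loads lazily by default
ncdata = from_iris(cubes)

for var in ncdata.variables.values():
    # Assumption: NcVariable exposes its array as ".data".
    if isinstance(var.data, da.Array):
        print(var.name, "is lazy, chunks:", var.data.chunks)
```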

docs/Makefile

Lines changed: 2 additions & 2 deletions
@@ -15,8 +15,8 @@ help:
 .PHONY: help Makefile

 allapi:
-    rm -rf ./api
-    sphinx-apidoc -Mfe -o ./api ../lib/ncdata
+    rm -rf ./details/api
+    sphinx-apidoc -Mfe -o ./details/api ../lib/ncdata

 # Catch-all target: route all unknown targets to Sphinx using the new
 # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).

docs/_templates/repo.html

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+<!-- A github repo link -->
+<a class="github reference external" href="https://github.com/pp-mo/ncdata">NcData on GitHub</a>
+
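A fragment like this is picked up through Sphinx's `templates_path` and is typically wired into the page sidebar in `conf.py`. The exact sidebar keys depend on the theme in use, so treat the following as a sketch only:

``` python
# docs/conf.py (sketch) -- register the templates directory and show the
# repo link fragment in every page's sidebar.
templates_path = ["_templates"]
html_sidebars = {"**": ["globaltoc.html", "repo.html", "searchbox.html"]}
```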