Commit da71b6d

V1 documentation (#3)
* Copy documentation from unification to docs branch
* Bump version
* Resolve docs-build issues
1 parent 5071486 commit da71b6d

31 files changed

+2909
-2813
lines changed

.readthedocs.yml

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1,8 +1,8 @@
11
version: 2
22
build:
3-
os: ubuntu-20.04
3+
os: ubuntu-22.04
44
tools:
5-
python: "3.11"
5+
python: "3.12"
66
sphinx:
77
configuration: docs/conf.py
88
formats: all

docs/reference.md renamed to docs/api_reference.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
# Reference
1+
# API Reference
22

33
## Readers / Writers
44

docs/usage.md renamed to docs/cli_usage.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
# Usage
1+
# CLI Usage
22

33
## Ingestion and Export
44

docs/conf.py

Lines changed: 55 additions & 4 deletions
@@ -1,34 +1,85 @@
11
"""Sphinx configuration."""
22

3+
# -- Project information -----------------------------------------------------
4+
35
project = "MDIO"
46
author = "TGS"
5-
copyright = "2023, TGS"
7+
copyright = "2023, TGS" # noqa: A001
8+
9+
# -- General configuration ---------------------------------------------------
10+
11+
# Add any Sphinx extension module names here, as strings. They can be
12+
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
13+
# ones.
14+
615
extensions = [
716
"sphinx.ext.autodoc",
817
"sphinx.ext.napoleon",
18+
"sphinx.ext.intersphinx",
19+
"sphinx.ext.autosummary",
20+
"sphinxcontrib.autodoc_pydantic",
921
"sphinx.ext.autosectionlabel",
1022
"sphinx_click",
1123
"sphinx_copybutton",
1224
"myst_nb",
25+
"sphinx_design",
1326
]
1427

28+
# List of patterns, relative to source directory, that match files and
29+
# directories to ignore when looking for source files.
30+
# This pattern also affects html_static_path and html_extra_path.
31+
exclude_patterns = [
32+
"_build",
33+
"Thumbs.db",
34+
"jupyter_execute",
35+
".DS_Store",
36+
"**.ipynb_checkpoints",
37+
]
38+
39+
html_theme = "furo"
40+
pygments_style = "vs"
41+
pygments_dark_style = "material"
42+
43+
# -- Intersphinx configuration -----------------------------------------------
44+
45+
intersphinx_mapping = {
46+
"python": ("https://docs.python.org/3", None),
47+
"numpy": ("https://numpy.org/doc/stable/", None),
48+
"pydantic": ("https://docs.pydantic.dev/latest/", None),
49+
"zarr": ("https://zarr.readthedocs.io/en/stable/", None),
50+
}
51+
52+
# -- Autodoc configuration ---------------------------------------------------
53+
1554
autodoc_typehints = "description"
1655
autodoc_typehints_format = "short"
1756
autodoc_member_order = "groupwise"
18-
autoclass_content = "both"
57+
autoclass_content = "class"
1958
autosectionlabel_prefix_document = True
2059

21-
html_theme = "furo"
60+
autodoc_pydantic_field_list_validators = False
61+
autodoc_pydantic_field_swap_name_and_alias = True
62+
autodoc_pydantic_field_show_alias = False
63+
autodoc_pydantic_model_show_config_summary = False
64+
autodoc_pydantic_model_show_validator_summary = False
65+
autodoc_pydantic_model_show_validator_members = False
66+
autodoc_pydantic_model_show_field_summary = False
67+
68+
# -- MyST configuration ------------------------------------------------------
2269

2370
myst_number_code_blocks = ["python"]
2471
myst_heading_anchors = 2
72+
myst_words_per_minute = 80
2573
myst_enable_extensions = [
74+
"colon_fence",
2675
"linkify",
2776
"replacements",
2877
"smartquotes",
78+
"attrs_inline",
2979
]
3080

31-
# sphinx-copybutton configurations
81+
# -- Sphinx Copybutton configuration -----------------------------------------
82+
3283
copybutton_prompt_text = r">>> |\.\.\. |\$ |In \[\d*\]: | {2,5}\.\.\.: | {5,8}: "
3384
copybutton_line_continuation_character = "\\"
3485
copybutton_prompt_is_regexp = True
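
As a rough illustration of what the `copybutton_prompt_text` pattern above matches, the sketch below applies the same regular expression to a few sample lines with Python's `re` module. The `strip_prompt` helper and the start-of-line anchoring are assumptions made for this sketch; it only demonstrates the pattern and does not reproduce everything the extension does when copying.

```python
import re

# Same pattern as copybutton_prompt_text in the configuration above.
PROMPT = r">>> |\.\.\. |\$ |In \[\d*\]: | {2,5}\.\.\.: | {5,8}: "

# Assumption for this sketch: a prompt only counts at the start of a line.
PROMPT_RE = re.compile(rf"^(?:{PROMPT})")


def strip_prompt(line: str) -> str:
    """Return the line without a leading prompt (hypothetical helper)."""
    return PROMPT_RE.sub("", line, count=1)


samples = [
    ">>> print(1)",        # Python REPL prompt
    "... x = 2",           # Python continuation prompt
    "$ pip install mdio",  # shell prompt (illustrative command)
    "In [3]: x = 1",       # IPython input prompt
    "   ...: y = 2",       # IPython continuation prompt
    "plain output",        # no prompt, left unchanged
]

for sample in samples:
    print(repr(strip_prompt(sample)))
```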

docs/data_models/chunk_grids.md

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
1+
```{eval-rst}
2+
:tocdepth: 3
3+
```
4+
5+
```{currentModule} mdio.schema.chunk_grid
6+
7+
```
8+
9+
# Chunk Grid Models
10+
11+
```{article-info}
12+
:author: Altay Sansal
13+
:date: "{sub-ref}`today`"
14+
:read-time: "{sub-ref}`wordcount-minutes` min read"
15+
:class-container: sd-p-0 sd-outline-muted sd-rounded-3 sd-font-weight-light
16+
```
17+
18+
Variables in the MDIO data model can use different types of chunk grids.
19+
These grids are essential for managing multi-dimensional data arrays efficiently.
20+
In this breakdown, we will explore four distinct data models within the MDIO schema,
21+
each serving a specific purpose in data handling and organization.
22+
23+
MDIO implements data models following the guidelines of the Zarr v3 spec and ZEPs:
24+
25+
- [Zarr core specification (version 3)](https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html)
26+
- [ZEP 1 — Zarr specification version 3](https://zarr.dev/zeps/accepted/ZEP0001.html)
27+
- [ZEP 3 — Variable chunking](https://zarr.dev/zeps/draft/ZEP0003.html)
28+
29+
## Regular Grid
30+
31+
The regular grid models are designed to represent a rectangular and regularly
32+
spaced chunk grid.
33+
34+
```{eval-rst}
35+
.. autosummary::
36+
RegularChunkGrid
37+
RegularChunkShape
38+
```
39+
40+
For a 1D array with `size = 31`{l=python}, we can divide it into 5 equally sized
41+
chunks. Note that the last chunk will be truncated to match the size of the array.
42+
43+
`{ "name": "regular", "configuration": { "chunkShape": [7] } }`{l=json}
44+
45+
Using the above schema, the resulting array chunks will look like this:
46+
47+
```bash
48+
←─ 7 ─→ ←─ 7 ─→ ←─ 7 ─→ ←─ 7 ─→ ↔ 3
49+
┌───────┬───────┬───────┬───────┬───┐
50+
└───────┴───────┴───────┴───────┴───┘
51+
```
52+
53+
For a 2D array with shape `rows, cols = (7, 17)`{l=python}, we can divide it into 9
54+
equally sized chunks.
55+
56+
`{ "name": "regular", "configuration": { "chunkShape": [3, 7] } }`{l=json}
57+
58+
Using the above schema, the resulting 2D array chunks will look like below.
59+
Note that the rows and columns are conceptual and visually not to scale.
60+
61+
```bash
62+
 ←─ 7 ─→ ←─ 7 ─→ ↔ 3
63+
┌───────┬───────┬───┐
64+
│       ╎       ╎   │ ↑
65+
│       ╎       ╎   │ 3
66+
│       ╎       ╎   │ ↓
67+
├╶╶╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
68+
│       ╎       ╎   │ ↑
69+
│       ╎       ╎   │ 3
70+
│       ╎       ╎   │ ↓
71+
├╶╶╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
72+
│       ╎       ╎   │ ↕ 1
73+
└───────┴───────┴───┘
74+
```
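
The chunk counts in the two regular-grid examples can be reproduced with a few lines of Python. This is a minimal sketch that uses only the standard library; `regular_chunk_sizes` and `regular_grid` are hypothetical helper names for illustration and are not part of the MDIO API.

```python
from itertools import product


def regular_chunk_sizes(size: int, chunk: int) -> list[int]:
    """Chunk lengths along one dimension; the last chunk may be truncated."""
    n_chunks = -(-size // chunk)  # ceiling division with integers only
    return [min(chunk, size - i * chunk) for i in range(n_chunks)]


def regular_grid(shape: tuple[int, ...], chunk_shape: tuple[int, ...]) -> list[tuple[int, ...]]:
    """Shapes of all chunks in a regular grid, in row-major order."""
    per_dim = [regular_chunk_sizes(s, c) for s, c in zip(shape, chunk_shape)]
    return list(product(*per_dim))


# 1D example: size 31 with chunkShape [7]
print(regular_chunk_sizes(31, 7))

# 2D example: shape (7, 17) with chunkShape [3, 7]
grid_2d = regular_grid((7, 17), (3, 7))
print(len(grid_2d), grid_2d[-1])
```

Only the trailing chunk along each dimension is truncated, which is why a regular grid needs just a single `chunkShape` entry per dimension.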
75+
76+
## Rectilinear Grid
77+
78+
The [RectilinearChunkGrid](RectilinearChunkGrid) model extends
79+
the concept of chunk grids to accommodate rectangular and irregularly spaced chunks.
80+
This model is useful in data structures where non-uniform chunk sizes are necessary.
81+
[RectilinearChunkShape](RectilinearChunkShape) specifies the chunk sizes for each
82+
dimension as a list, allowing for irregular intervals.
83+
84+
```{eval-rst}
85+
.. autosummary::
86+
RectilinearChunkGrid
87+
RectilinearChunkShape
88+
```
89+
90+
:::{note}
91+
It's important to ensure that the sum of the irregular spacings specified
92+
in the `chunkShape` matches the size of the respective array dimension.
93+
:::
94+
95+
For a 1D array with `size = 39`{l=python}, we can divide it into 5 irregularly sized
96+
chunks.
97+
98+
`{ "name": "rectilinear", "configuration": { "chunkShape": [[10, 7, 5, 7, 10]] } }`{l=json}
99+
100+
Using the above schema, the resulting array chunks will look like this:
101+
102+
```bash
103+
←── 10 ──→ ←─ 7 ─→ ← 5 → ←─ 7 ─→ ←── 10 ──→
104+
┌──────────┬───────┬─────┬───────┬──────────┐
105+
└──────────┴───────┴─────┴───────┴──────────┘
106+
```
107+
108+
For a 2D array with shape `rows, cols = (7, 25)`{l=python}, we can divide it into 12
109+
rectilinear (rectangular but irregular) chunks. Note that the rows and columns are
110+
conceptual and visually not to scale.
111+
112+
`{ "name": "rectilinear", "configuration": { "chunkShape": [[3, 1, 3], [10, 5, 7, 3]] } }`{l=json}
113+
114+
```bash
115+
 ←── 10 ──→ ← 5 → ←─ 7 ─→ ↔ 3
116+
┌──────────┬─────┬───────┬───┐
117+
│          ╎     ╎       ╎   │ ↑
118+
│          ╎     ╎       ╎   │ 3
119+
│          ╎     ╎       ╎   │ ↓
120+
├╶╶╶╶╶╶╶╶╶╶┼╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
121+
│          ╎     ╎       ╎   │ ↕ 1
122+
├╶╶╶╶╶╶╶╶╶╶┼╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
123+
│          ╎     ╎       ╎   │ ↑
124+
│          ╎     ╎       ╎   │ 3
125+
│          ╎     ╎       ╎   │ ↓
126+
└──────────┴─────┴───────┴───┘
127+
```
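
The requirement that the irregular spacings add up to the array dimension can be checked mechanically. The sketch below is one possible way to do so with the standard library; `validate_rectilinear` and `chunk_offsets` are hypothetical names for illustration, and the validation that the MDIO models themselves perform may differ.

```python
from itertools import accumulate
from math import prod


def validate_rectilinear(shape, chunk_shape) -> int:
    """Check that the chunk sizes cover each dimension exactly; return the chunk count."""
    if len(shape) != len(chunk_shape):
        raise ValueError("chunkShape needs one list of sizes per array dimension")
    for axis, (size, chunks) in enumerate(zip(shape, chunk_shape)):
        if sum(chunks) != size:
            raise ValueError(
                f"axis {axis}: chunk sizes sum to {sum(chunks)}, expected {size}"
            )
    return prod(len(chunks) for chunks in chunk_shape)


def chunk_offsets(chunks) -> list[int]:
    """Start index of every chunk along one dimension."""
    return [0, *accumulate(chunks)][:-1]


# 2D example: shape (7, 25) with chunkShape [[3, 1, 3], [10, 5, 7, 3]]
print(validate_rectilinear((7, 25), [[3, 1, 3], [10, 5, 7, 3]]))
print(chunk_offsets([10, 5, 7, 3]))
```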
128+
129+
## Model Reference
130+
131+
:::{dropdown} RegularChunkGrid
132+
:animate: fade-in-slide-down
133+
134+
```{eval-rst}
135+
.. autopydantic_model:: RegularChunkGrid
136+
137+
----------
138+
139+
.. autopydantic_model:: RegularChunkShape
140+
```
141+
142+
:::
143+
:::{dropdown} RectilinearChunkGrid
144+
:animate: fade-in-slide-down
145+
146+
```{eval-rst}
147+
.. autopydantic_model:: RectilinearChunkGrid
148+
149+
----------
150+
151+
.. autopydantic_model:: RectilinearChunkShape
152+
```
153+
154+
:::

docs/data_models/compressors.md

Lines changed: 100 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,100 @@
1+
```{eval-rst}
2+
:tocdepth: 3
3+
```
4+
5+
```{currentModule} mdio.schema.compressors
6+
7+
```
8+
9+
# Compressors
10+
11+
```{article-info}
12+
:author: Altay Sansal
13+
:date: "{sub-ref}`today`"
14+
:read-time: "{sub-ref}`wordcount-minutes` min read"
15+
:class-container: sd-p-0 sd-outline-muted sd-rounded-3 sd-font-weight-light
16+
```
17+
18+
## Dataset Compression
19+
20+
MDIO relies on [numcodecs] for data compression. We provide good defaults based
21+
on opinionated and limited heuristics for each compressor for various energy datasets.
22+
However, using these data models, the compression can be customized.
23+
24+
[Numcodecs] is a project that provides a convenient interface to different compression
25+
libraries. We selected the [Blosc] and [ZFP] compressors for lossless and lossy
26+
compression of energy data.
27+
28+
## Blosc
29+
30+
A high-performance compressor optimized for binary data, combining fast compression
31+
with a byte-shuffle filter for enhanced efficiency, particularly effective with
32+
numerical arrays in multi-threaded environments.
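
To give a feel for what a byte-shuffle filter does, the sketch below rearranges the bytes of a small integer array so that bytes of equal significance sit next to each other before compression. It is not Blosc: `zlib` from the standard library stands in for the codec, the `shuffle` and `unshuffle` helpers are hypothetical, and the input length is assumed to be a multiple of the item size.

```python
import struct
import zlib


def shuffle(raw: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    return b"".join(raw[i::itemsize] for i in range(itemsize))


def unshuffle(data: bytes, itemsize: int) -> bytes:
    """Invert shuffle() by interleaving the byte planes again."""
    n_items = len(data) // itemsize
    out = bytearray(len(data))
    for i in range(itemsize):
        out[i::itemsize] = data[i * n_items:(i + 1) * n_items]
    return bytes(out)


# 10,000 little-endian 32-bit integers as sample data.
values = list(range(10_000))
raw = struct.pack(f"<{len(values)}i", *values)

plain = zlib.compress(raw)                 # compress the bytes as stored
shuffled = zlib.compress(shuffle(raw, 4))  # shuffle first, then compress

# Compare the raw size with both compressed sizes.
print(len(raw), len(plain), len(shuffled))
```

Decompressing and calling `unshuffle` with the same item size restores the original bytes, so the filter itself is lossless.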
33+
34+
For more details about compression modes, see [Blosc Documentation].
35+
36+
```{eval-rst}
37+
.. autosummary::
38+
Blosc
39+
```
40+
41+
## ZFP
42+
43+
ZFP is a compression algorithm tailored for floating-point and integer arrays, offering
44+
lossy and lossless compression with customizable precision, well-suited for large
45+
scientific datasets with a focus on balancing data fidelity and compression ratio.
46+
47+
For more details about compression modes, see [ZFP Documentation].
48+
49+
```{eval-rst}
50+
.. autosummary::
51+
ZFP
52+
```
53+
54+
[numcodecs]: https://github.com/zarr-developers/numcodecs
55+
[blosc]: https://github.com/Blosc/c-blosc
56+
[blosc documentation]: https://www.blosc.org/python-blosc/python-blosc.html
57+
[zfp]: https://github.com/LLNL/zfp
58+
[zfp documentation]: https://computing.llnl.gov/projects/zfp
59+
60+
## Model Reference
61+
62+
63+
:::{dropdown} Blosc
64+
:animate: fade-in-slide-down
65+
66+
```{eval-rst}
67+
.. autopydantic_model:: Blosc
68+
69+
----------
70+
71+
.. autoclass:: BloscAlgorithm()
72+
:members:
73+
:undoc-members:
74+
:member-order: bysource
75+
76+
----------
77+
78+
.. autoclass:: BloscShuffle()
79+
:members:
80+
:undoc-members:
81+
:member-order: bysource
82+
```
83+
84+
:::
85+
86+
:::{dropdown} ZFP
87+
:animate: fade-in-slide-down
88+
89+
```{eval-rst}
90+
.. autopydantic_model:: ZFP
91+
92+
----------
93+
94+
.. autoclass:: ZFPMode()
95+
:members:
96+
:undoc-members:
97+
:member-order: bysource
98+
```
99+
100+
:::
