Commit 5a239f4

Merge pull request #90 from bcdev/forman-66-how_do_i_guide
Added page "How do I ..."
2 parents f2a8222 + 88706e5 commit 5a239f4

3 files changed: 127 additions, 0 deletions

CHANGES.md

Lines changed: 3 additions & 0 deletions
@@ -6,8 +6,11 @@
 * Moved all project configuration to `pyproject.toml` and removed
   `setup.cfg` and requirements files. (#88)

+* Added new section _How do I ..._ to the documentation. (#66)
+
 * Fixed link to _slice sources_ in documentation main page.

+
 ## Version 0.7.0 (from 2024-03-19)

 * Made writing custom slice sources easier and more flexible: (#82)

docs/howdoi.md

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
# How do I ...

## ... create datacubes from a directory of GeoTIFFs

Files with GeoTIFF format cannot be opened directly by `zappend` unless
you add [rioxarray](https://corteva.github.io/rioxarray/) to your
Python environment.

Then write your own [slice source](guide.md#slice-sources) and
pass it using the configuration setting [`slice_source`](config.md#slice_source):

```python
import glob
import numpy as np
import rioxarray as rxr
import xarray as xr
from zappend.api import zappend

def get_dataset_from_geotiff(tiff_path):
    ds = rxr.open_rasterio(tiff_path)
    # Add missing time dimension
    slice_time = get_slice_time(tiff_path)
    slice_ds = ds.expand_dims("time", axis=0)
    # The coordinate must be an xr.DataArray; xr.Dataset has no `dims` argument
    slice_ds.coords["time"] = xr.DataArray(np.array([slice_time]), dims="time")
    try:
        yield slice_ds
    finally:
        # Release the file handle once the slice has been processed
        ds.close()

zappend(sorted(glob.glob("inputs/*.tif")),
        slice_source=get_dataset_from_geotiff,
        target_dir="output/tif-cube.zarr")
```

In the example above, the user-supplied function `get_slice_time()` returns
the time label of a given GeoTIFF file as a value of type `np.datetime64`.
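
A minimal sketch of what such a `get_slice_time()` helper could look like, assuming (hypothetically) that each file name contains an 8-digit date such as `scene_20240319.tif`. The naming convention and the regular expression are illustrative assumptions and not part of `zappend`; adapt them to your own files.

```python
import re
from pathlib import Path

import numpy as np


def get_slice_time(tiff_path):
    """Hypothetical helper: parse a YYYYMMDD date from the file name."""
    # Look for the first run of 8 digits in the file name (assumed convention)
    match = re.search(r"(\d{4})(\d{2})(\d{2})", Path(tiff_path).name)
    if match is None:
        raise ValueError(f"no date found in file name: {tiff_path}")
    year, month, day = match.groups()
    # Build an ISO date string and convert it to np.datetime64
    return np.datetime64(f"{year}-{month}-{day}")
```

If the time label is stored in the GeoTIFF metadata rather than in the file name, the helper would read it from there instead.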

## ... create datacubes from datasets without append dimension

`zappend` expects the append dimension to exist in slice datasets and
expects that at least one variable exists that makes use of that dimension.
For example, if you are appending spatial 2-d images with dimensions `x` and `y`
along a dimension `time`, you first need to expand the images into the `time`
dimension. Here the 2-d image dataset is called `image_ds` and `slice_time`
is its associated time value of type `np.datetime64`.

```python
slice_ds = image_ds.expand_dims("time", axis=0)
slice_ds.coords["time"] = xr.DataArray(np.array([slice_time]), dims="time")
```

See also [How do I create datacubes from a directory of GeoTIFFs](#create-datacubes-from-a-directory-of-geotiffs)
above.
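
The expansion into the append dimension can be tried on synthetic data without any input files. The sketch below assumes a made-up 2-d dataset with a single variable named `reflectance` (a hypothetical name) and an arbitrary date; it only illustrates the resulting dimensions.

```python
import numpy as np
import xarray as xr

# Synthetic 2-d image dataset with dimensions y and x (illustrative only)
image_ds = xr.Dataset(
    {"reflectance": (("y", "x"), np.zeros((3, 4)))},
    coords={"y": np.arange(3), "x": np.arange(4)},
)
slice_time = np.datetime64("2024-03-19")

# Insert a new leading dimension "time" of size 1 ...
slice_ds = image_ds.expand_dims("time", axis=0)
# ... and attach the time label as its coordinate
slice_ds.coords["time"] = xr.DataArray(np.array([slice_time]), dims="time")

print(slice_ds["reflectance"].dims)
```

After this step the variable uses the `time` dimension, which is the precondition described in this section.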

## ... dynamically update global metadata attributes

Refer to the section about [target attributes](guide.md#attributes)
in the user guide.

## ... find out what is limiting the performance

Use the [logging](guide.md#logging) configuration to see which processing steps
take most of the time.
Use the [profiling](guide.md#profiling) configuration to inspect in more
detail which parts of the processing are the bottlenecks.

## ... write a log file

Use the following [logging](guide.md#logging) configuration:

```json
{
    "logging": {
        "version": 1,
        "formatters": {
            "normal": {
                "format": "%(asctime)s %(levelname)s %(message)s",
                "style": "%"
            }
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "formatter": "normal"
            },
            "file": {
                "class": "logging.FileHandler",
                "formatter": "normal",
                "filename": "zappend.log",
                "mode": "w",
                "encoding": "utf-8"
            }
        },
        "loggers": {
            "zappend": {
                "level": "INFO",
                "handlers": ["console", "file"]
            }
        }
    }
}
```
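
The value of `logging` above has the shape of the dictionary schema used by Python's standard `logging.config.dictConfig()`, so it can be tried outside of `zappend`. The following sketch assumes nothing about `zappend` itself: it applies the same dictionary with the standard library and reads back what was written to the file. The temporary log file location is an assumption made only to keep the example free of side effects.

```python
import logging
import logging.config
import os
import tempfile

# Write to a temporary location instead of "zappend.log" (illustrative choice)
log_path = os.path.join(tempfile.mkdtemp(), "zappend.log")

logging_config = {
    "version": 1,
    "formatters": {
        "normal": {"format": "%(asctime)s %(levelname)s %(message)s", "style": "%"}
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "normal"},
        "file": {
            "class": "logging.FileHandler",
            "formatter": "normal",
            "filename": log_path,
            "mode": "w",
            "encoding": "utf-8",
        },
    },
    "loggers": {"zappend": {"level": "INFO", "handlers": ["console", "file"]}},
}

logging.config.dictConfig(logging_config)
logger = logging.getLogger("zappend")
logger.info("hello from the sketch")
logger.debug("not recorded at INFO level")

# Make sure everything is on disk before reading the file back
for handler in logger.handlers:
    handler.flush()
with open(log_path, encoding="utf-8") as f:
    log_text = f.read()
```

Messages below the configured `INFO` level, such as the `debug()` call here, do not reach either handler.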

## ... address common errors

### Error `Target parent directory does not exist`

For security reasons, `zappend` does not create target directories
automatically. You should make sure the parent directory exists before
calling `zappend`.
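
For a target on the local filesystem, the parent directory can be created with the standard library before calling `zappend`. This is a minimal sketch; the helper name `ensure_target_parent` is hypothetical, and remote targets (for example object storage) would need a different approach.

```python
from pathlib import Path


def ensure_target_parent(target_dir):
    """Create the parent directory of a local target path if it is missing."""
    parent = Path(target_dir).parent
    # parents=True creates intermediate directories, exist_ok=True makes it idempotent
    parent.mkdir(parents=True, exist_ok=True)
    return parent
```

For the target used earlier on this page, `ensure_target_parent("output/tif-cube.zarr")` would create `output/` but not the Zarr directory itself.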

### Error `Target is locked`

In this case the target lock file still exists, which means that a previous
rollback did not complete normally. You can no longer trust the integrity of
any existing target dataset. The recommended fix is to remove the lock file
and any remaining target dataset artifacts. You can do that manually or use the
configuration setting `force_new`.

### Error `Append dimension 'foo' not found in dataset`

Refer to [How do I create datacubes from datasets without append dimension](#create-datacubes-from-datasets-without-append-dimension).

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ nav:
   - Overview: index.md
   - Getting Started: start.md
   - User Guide: guide.md
+  - How do I ...: howdoi.md
   - Configuration: config.md
   - CLI Reference: cli.md
   - API Reference: api.md
