Commit e1b8bca

Merge pull request #64 from d-v-b/feat/sentinel2-pydantic-model

feat/sentinel2 pydantic model

2 parents 187c553 + 20d7fd0, commit e1b8bca

31 files changed: +56507 −1381 lines

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion

@@ -25,5 +25,5 @@ repos:
   exclude: tests/.*
   additional_dependencies:
     - types-attrs
+    - typing-extensions>=4.15.0
     - pydantic>=2.12
-    - typing-extensions>=4.15

docs/api-reference.md

Lines changed: 24 additions & 1 deletion

@@ -9,6 +9,7 @@ Complete reference for the EOPF GeoZarr library's Python API.
 The main function for converting EOPF datasets to GeoZarr format.
 
 ```python
+# test: skip
 def create_geozarr_dataset(
     dt_input: xr.DataTree,
     groups: List[str],
@@ -39,6 +40,7 @@ def create_geozarr_dataset(
 **Example:**
 
 ```python
+# test: skip
 import xarray as xr
 from eopf_geozarr import create_geozarr_dataset
 
@@ -58,6 +60,7 @@ dt_geozarr = create_geozarr_dataset(
 Sets up GeoZarr-compliant metadata for a DataTree.
 
 ```python
+# test: skip
 def setup_datatree_metadata_geozarr_spec_compliant(
     dt: xr.DataTree,
     geozarr_groups: Dict[str, xr.Dataset]
@@ -69,6 +72,7 @@ def setup_datatree_metadata_geozarr_spec_compliant(
 Writes a single group to GeoZarr format with proper metadata.
 
 ```python
+# test: skip
 def write_geozarr_group(
     group_path: str,
     datasets: Dict[str, xr.Dataset],
@@ -84,6 +88,7 @@ def write_geozarr_group(
 Creates multiscales metadata compliant with GeoZarr specification.
 
 ```python
+# test: skip
 def create_geozarr_compliant_multiscales(
     datasets: Dict[str, xr.Dataset],
     tile_width: int = 256
@@ -97,6 +102,7 @@ def create_geozarr_compliant_multiscales(
 Calculates optimal chunk size that aligns with data dimensions.
 
 ```python
+# test: skip
 def calculate_aligned_chunk_size(
     dimension_size: int,
     target_chunk_size: int
@@ -127,6 +133,7 @@ print(chunk_size)  # Returns 3660 (10980 / 3 = 3660)
 Downsamples a 2D array by factor of 2 using mean aggregation.
 
 ```python
+# test: skip
 def downsample_2d_array(
     data: np.ndarray,
     factor: int = 2
@@ -138,6 +145,7 @@ def downsample_2d_array(
 Validates existing band data against expected specifications.
 
 ```python
+# test: skip
 def validate_existing_band_data(
     dataset: xr.Dataset,
     band_name: str,
@@ -151,6 +159,7 @@ def validate_existing_band_data(
 ### Storage Path Utilities
 
 ```python
+# test: skip
 # Path normalization and validation
 def normalize_path(path: str) -> str
 def is_s3_path(path: str) -> bool
@@ -164,6 +173,7 @@ def get_s3_storage_options(s3_path: str, **s3_kwargs: Any) -> Dict[str, Any]
 ### S3 Operations
 
 ```python
+# test: skip
 # S3 store creation and validation
 def validate_s3_access(s3_path: str, **s3_kwargs: Any) -> tuple[bool, Optional[str]]
 def s3_path_exists(s3_path: str, **s3_kwargs: Any) -> bool
@@ -181,6 +191,7 @@ def read_s3_json_metadata(s3_path: str, **s3_kwargs: Any) -> Dict[str, Any]
 ### Zarr Operations
 
 ```python
+# test: skip
 # Zarr group operations
 def open_zarr_group(path: str, mode: str = "r", **kwargs: Any) -> zarr.Group
 def open_s3_zarr_group(s3_path: str, mode: str = "r", **s3_kwargs: Any) -> zarr.Group
@@ -195,6 +206,7 @@ async def async_consolidate_metadata(output_path: str, **storage_kwargs) -> None
 ### Coordinate Metadata
 
 ```python
+# test: skip
 def _add_coordinate_metadata(ds: xr.Dataset) -> None
 ```
 
@@ -207,13 +219,15 @@ Adds proper coordinate metadata including:
 ### Grid Mapping
 
 ```python
+# test: skip
 def _setup_grid_mapping(ds: xr.Dataset, grid_mapping_var_name: str) -> None
 def _add_geotransform(ds: xr.Dataset, grid_mapping_var: str) -> None
 ```
 
 ### CRS and Tile Matrix
 
 ```python
+# test: skip
 def create_native_crs_tile_matrix_set(
     crs: Any,
     transform: Any,
@@ -230,6 +244,7 @@ Creates a tile matrix set for native CRS (non-Web Mercator).
 ### calculate_overview_levels
 
 ```python
+# test: skip
 def calculate_overview_levels(
     width: int,
     height: int,
@@ -242,6 +257,7 @@ Calculates appropriate overview levels based on data dimensions.
 ### create_overview_dataset_all_vars
 
 ```python
+# test: skip
 def create_overview_dataset_all_vars(
     ds: xr.Dataset,
     overview_factor: int
@@ -255,6 +271,7 @@ Creates overview dataset with all variables downsampled.
 ### Retry Logic
 
 ```python
+# test: skip
 def write_dataset_band_by_band_with_validation(
     ds: xr.Dataset,
     output_path: str,
@@ -270,6 +287,7 @@ Writes dataset with robust error handling and retry logic.
 ### Coordinate Attributes
 
 ```python
+# test: skip
 def _get_x_coord_attrs() -> Dict[str, Any]
 def _get_y_coord_attrs() -> Dict[str, Any]
 ```
@@ -279,6 +297,7 @@ Returns standard attributes for X and Y coordinates.
 ### Grid Mapping Detection
 
 ```python
+# test: skip
 def is_grid_mapping_variable(ds: xr.Dataset, var_name: str) -> bool
 ```
 
@@ -289,6 +308,7 @@ Determines if a variable is a grid mapping variable.
 ### Basic Conversion
 
 ```python
+# test: skip
 import xarray as xr
 from eopf_geozarr import create_geozarr_dataset
 
@@ -304,6 +324,7 @@ dt_geozarr = create_geozarr_dataset(
 ### Advanced S3 Usage
 
 ```python
+# test: skip
 from eopf_geozarr.conversion.fs_utils import (
     validate_s3_access,
     get_s3_storage_options
@@ -316,7 +337,7 @@ is_valid, error = validate_s3_access(s3_path)
 if is_valid:
     # Get storage options
     storage_opts = get_s3_storage_options(s3_path)
-
+
     # Convert with S3
     dt_geozarr = create_geozarr_dataset(
         dt_input=dt,
@@ -329,6 +350,7 @@ if is_valid:
 ### Custom Chunking
 
 ```python
+# test: skip
 from eopf_geozarr.conversion.utils import calculate_aligned_chunk_size
 
 # Calculate optimal chunks for your data
@@ -348,6 +370,7 @@ dt_geozarr = create_geozarr_dataset(
 The library uses comprehensive type hints. Import types as needed:
 
 ```python
+# test: skip
 from typing import Dict, List, Optional, Tuple, Any
 import xarray as xr
 import numpy as np
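The hunks above describe `downsample_2d_array` as downsampling by a factor of 2 using mean aggregation. A minimal NumPy sketch of that behavior (an illustration of the documented semantics, not the library's actual implementation) could look like:

```python
import numpy as np

def downsample_2d_array(data: np.ndarray, factor: int = 2) -> np.ndarray:
    """Downsample a 2D array by block-mean aggregation (sketch)."""
    # Trim edges so both dimensions divide evenly by the factor.
    h = data.shape[0] - data.shape[0] % factor
    w = data.shape[1] - data.shape[1] % factor
    # Reshape into (rows, factor, cols, factor) blocks, then average
    # each factor x factor block.
    blocks = data[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

demo = downsample_2d_array(np.arange(16, dtype=float).reshape(4, 4))
```

Each 2×2 block of the 4×4 input collapses to its mean, yielding a 2×2 result.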

docs/architecture.md

Lines changed: 10 additions & 2 deletions

@@ -59,6 +59,7 @@ graph TB
 The main conversion engine orchestrates the transformation process:
 
 ```python
+# test: skip
 def create_geozarr_dataset(
     dt_input: xr.DataTree,
     groups: List[str],
@@ -102,17 +103,19 @@ Core processing algorithms:
 **Chunking:**
 
 ```python
+# test: skip
 def calculate_aligned_chunk_size(
-    dimension_size: int, 
+    dimension_size: int,
     target_chunk_size: int
 ) -> int
 ```
 
 **Downsampling:**
 
 ```python
+# test: skip
 def downsample_2d_array(
-    data: np.ndarray, 
+    data: np.ndarray,
     factor: int = 2
 ) -> np.ndarray
 ```
@@ -218,6 +221,7 @@ y_attrs = {
 Each dataset includes proper grid mapping information:
 
 ```python
+# test: skip
 grid_mapping_attrs = {
     'grid_mapping_name': 'transverse_mercator',  # or appropriate mapping
     'projected_crs_name': crs.to_string(),
@@ -232,6 +236,7 @@ grid_mapping_attrs = {
 GeoZarr-compliant multiscales structure:
 
 ```python
+# test: skip
 multiscales = [{
     'version': '0.4',
     'name': group_name,
@@ -289,6 +294,7 @@ def calculate_aligned_chunk_size(dimension_size: int, target_chunk_size: int) ->
 **Band-by-Band Processing:**
 
 ```python
+# test: skip
 def write_dataset_band_by_band_with_validation(
     ds: xr.Dataset,
     output_path: str,
@@ -317,6 +323,7 @@ def write_dataset_band_by_band_with_validation(
 The library provides a unified interface for different storage backends:
 
 ```python
+# test: skip
 def get_storage_options(path: str, **kwargs) -> Optional[Dict[str, Any]]:
     """Get storage options based on path type."""
     if is_s3_path(path):
@@ -336,6 +343,7 @@ def get_storage_options(path: str, **kwargs) -> Optional[Dict[str, Any]]:
 **Configuration:**
 
 ```python
+# test: skip
 s3_options = {
     'key': os.environ.get('AWS_ACCESS_KEY_ID'),
     'secret': os.environ.get('AWS_SECRET_ACCESS_KEY'),
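The architecture hunks above reuse the `calculate_aligned_chunk_size` signature, and the api-reference context notes that a 10980-pixel dimension yields a 3660 chunk (10980 / 3 = 3660). One plausible sketch consistent with that example — the target value of 4096 is an assumption here, not stated in the diff — picks the largest divisor of the dimension that does not exceed the target:

```python
def calculate_aligned_chunk_size(dimension_size: int, target_chunk_size: int) -> int:
    """Largest chunk size <= target that divides the dimension evenly (sketch)."""
    # Walking down from the target guarantees the first divisor found is
    # the largest one, so chunks tile the dimension with no ragged edge.
    for candidate in range(min(target_chunk_size, dimension_size), 0, -1):
        if dimension_size % candidate == 0:
            return candidate
    return dimension_size

# 10980 is the width of a Sentinel-2 10 m tile; 3660 = 10980 / 3.
aligned = calculate_aligned_chunk_size(10980, 4096)
```

Divisor alignment matters for Zarr because a chunk size that divides the dimension exactly avoids partially filled edge chunks.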

docs/converter.md

Lines changed: 1 addition & 0 deletions

@@ -45,6 +45,7 @@ The converter also provides a Python API for programmatic usage:
 ### Example: Basic Conversion
 
 ```python
+# test: skip
 import xarray as xr
 from eopf_geozarr import create_geozarr_dataset

docs/examples.md

Lines changed: 1 addition & 0 deletions

@@ -9,6 +9,7 @@ Practical examples demonstrating common use cases for the EOPF GeoZarr library.
 Convert a local EOPF dataset to GeoZarr format:
 
 ```python
+# test: skip
 import xarray as xr
 from eopf_geozarr import create_geozarr_dataset

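Nearly every documentation hunk in this commit prepends `# test: skip` to a fenced `python` block, presumably so a doc code-block test runner leaves these illustrative snippets unexecuted. A hypothetical sketch of how such a runner might filter blocks (the marker semantics are assumed here, not taken from the repository):

```python
import re

# Match the body of each fenced python block in a markdown document.
_FENCE = re.compile(r"```python\n(.*?)```", re.DOTALL)

def runnable_doc_blocks(markdown: str) -> list[str]:
    """Return python code blocks not marked with '# test: skip' (sketch)."""
    blocks = _FENCE.findall(markdown)
    return [b for b in blocks if not b.lstrip().startswith("# test: skip")]

# Build a tiny two-block document without literal backtick fences in source.
fence = "`" * 3
_doc = (
    f"{fence}python\n# test: skip\nx = 1\n{fence}\n\n"
    f"{fence}python\ny = 2\n{fence}\n"
)
demo_blocks = runnable_doc_blocks(_doc)
```

With this convention, only unmarked blocks would be executed by the doc tests, which explains why the purely illustrative API snippets in these docs all gained the marker.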
0 commit comments