|
5 | 5 | ### Highlights |
6 | 6 |
|
7 | 7 | - License changed from AGPL-3.0-only to **Apache-2.0**. |
8 | | -- **Dataset catalog**: `build()`, `register()`, `register_local()` with a |
9 | | - growing catalog of pre-registered datasets across Earth Search, Planetary |
10 | | - Computer. |
11 | | -- **Multi-cloud obstore backend**: native S3, Azure Blob, and GCS routing |
12 | | - via URL auto-detection. Cross-cloud credential provider guard with |
13 | | - automatic fallback to anonymous access. |
| 8 | +- **Dataset catalog**: `build()` with 13 pre-registered datasets across |
| 9 | + Earth Search, Planetary Computer, and AlphaEarth Foundation. |
| 10 | + `register_local()` for adding your own. |
| 11 | +- **Multi-cloud obstore backend**: S3, Azure Blob, and GCS routing via URL |
| 12 | + auto-detection, with automatic fallback to anonymous access. |
14 | 13 | - **`create_backend()`** for authenticated reads with obstore credential |
15 | | - providers (e.g., Planetary Computer SAS). |
16 | | -- **Local catalog persistence**: `register_local()` persists to |
17 | | - `~/.rasteret/datasets.local.json`; `export_local_descriptor()` for |
18 | | - sharing catalog entries alongside Collections. |
19 | | -- **Torchgeo GeoDataset**: Adapter created that use rasteret's own I/O parts to create a Torchgeo |
20 | | - GeoDataset. |
21 | | -- **Native dtype preservation**: COG tiles return in their source dtype (uint16, int8, |
22 | | - float32, etc.). No forced float32 conversion. |
23 | | -- **Rasterio-aligned masking defaults**: AOI reads now default to `all_touched=False` |
24 | | - and fill masked/outside-coverage pixels with `nodata` when present, otherwise `0`. |
25 | | - The primary read API (`read_cog`) returns a `valid_mask`. |
26 | | -- **rioxarray removed**: CRS encoding uses pyproj CF conventions directly (WKT2, PROJJSON, |
27 | | - GeoTransform). The `xarray` extra no longer pulls rioxarray. |
28 | | -- **Extended TIFF header parsing**: nodata, SamplesPerPixel, PlanarConfiguration, |
29 | | - PhotometricInterpretation, ExtraSamples, GeoDoubleParams CRS support. |
30 | | -- **Cross-CRS masking**: by default, uses the exact transformed polygon (rasterio-aligned). |
31 | | - Optional bbox masking remains available for bbox-style workflows. |
32 | | -- **Multi-CRS auto-reprojection**: queries spanning multiple UTM zones automatically |
33 | | - reproject to the most common CRS. Cross-CRS reprojection uses GDAL's |
34 | | - `calculate_default_transform` for correct resolution handling. |
35 | | - |
| 14 | + providers (e.g., Planetary Computer SAS tokens). |
| 15 | +- **TorchGeo adapter**: `collection.to_torchgeo_dataset()` returns a |
| 16 | + `GeoDataset` backed by Rasteret's async COG reader. Supports |
| 17 | + `time_series=True` (`[T, C, H, W]` output), multi-CRS reprojection, |
| 18 | + and works with all TorchGeo samplers and collation helpers. |
| 19 | +- **Native dtype preservation**: COG tiles return in their source dtype |
| 20 | + (uint16, int8, float32, etc.) instead of forcing float32. |
| 21 | +- **Rasterio-aligned masking**: AOI reads default to `all_touched=False` |
| 22 | + and fill outside-coverage pixels with `nodata` when present, otherwise `0`. |
| 23 | + `read_cog` returns a `valid_mask`. |
| 24 | +- **rioxarray removed**: CRS encoding uses pyproj CF conventions directly. |
| 25 | + The `xarray` extra no longer pulls rioxarray. |
| 26 | +- **Extended TIFF header parsing**: nodata, SamplesPerPixel, |
| 27 | + PlanarConfiguration, PhotometricInterpretation, ExtraSamples, |
| 28 | + GeoDoubleParams CRS support. |
| 29 | +- **Multi-CRS auto-reprojection**: queries spanning multiple UTM zones |
| 30 | + reproject to the most common CRS using GDAL's |
| 31 | + `calculate_default_transform`. |
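
The "most common CRS" rule in the multi-CRS bullet can be pictured as a simple majority vote over the EPSG codes of the matching records. The sketch below is illustrative only and is not Rasteret's implementation: the `most_common_epsg` helper, the sample codes, and the tie-breaking rule (first-seen wins) are all assumptions.

```python
from collections import Counter


def most_common_epsg(epsg_codes):
    """Return the EPSG code that occurs most often (hypothetical helper)."""
    if not epsg_codes:
        raise ValueError("need at least one EPSG code")
    # Counter.most_common keeps first-seen order among equal counts,
    # so ties resolve to the code encountered first (an assumption here).
    return Counter(epsg_codes).most_common(1)[0][0]


# Three records in UTM zone 32N, one in zone 33N.
record_epsgs = [32632, 32632, 32633, 32632]
target_epsg = most_common_epsg(record_epsgs)

# Records whose CRS differs from the target are the ones needing reprojection.
needs_reprojection = [code != target_epsg for code in record_epsgs]
```

Here only the single zone-33N record would be reprojected; the other three are read in their native grid.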
36 | 32 |
|
37 | 33 | ### Collection API |
38 | 34 |
|
39 | | -- **Collection inspection**: `.bands`, `.bounds`, `.epsg`, `len()`, `__repr__()`, |
40 | | - `.describe()`, `.compare_to_catalog()` for quick metadata access without |
41 | | - materializing the full table. |
42 | | -- **Filtering**: `collection.subset(cloud_cover_lt=..., date_range=..., bbox=..., |
43 | | - geometries=..., split=...)` for friendly filtering; `collection.where(expr)` for |
44 | | - raw Arrow dataset expressions. `select_split()` convenience wrapper. |
| 35 | +- **Inspection**: `.bands`, `.bounds`, `.epsg`, `len()`, `__repr__()`, |
| 36 | + `.describe()`, `.compare_to_catalog()`. |
| 37 | +- **Filtering**: `collection.subset(cloud_cover_lt=..., date_range=..., |
| 38 | + bbox=..., geometries=..., split=...)` and `collection.where(expr)` for |
| 39 | + raw Arrow expressions. |
45 | 40 | - **Sharing**: `collection.export("path/")` writes a portable copy; |
46 | 41 | `rasteret.load("path/")` reloads it. |
47 | 42 |
|
48 | 43 | ### Other changes |
49 | 44 |
|
50 | 45 | - Arrow-native geometry internals (GeoArrow replaces Shapely in hot paths). |
51 | | -- obstore as base dependency for Rust-native HTTP backend. |
52 | | -- CLI: `rasteret collections build|list|info|delete|import`, `rasteret build` shortcut. |
53 | | -- CLI: `rasteret datasets list|info|build|register-local|export-local|unregister-local`. |
| 46 | +- obstore as base dependency (Rust-native async HTTP). |
| 47 | +- CLI: `rasteret collections build|list|info|delete|import`, |
| 48 | + `rasteret datasets list|info|build|register-local|export-local|unregister-local`. |
| 49 | +- TorchGeo `time_series=True` uses spatial-only intersection, matching |
| 50 | + TorchGeo's own `RasterDataset` behaviour where all spatially overlapping |
| 51 | + records are stacked regardless of the sampler's time slice. |
54 | 52 |
|
55 | 53 | ### Tested |
56 | 54 |
|
57 | | -- All three output paths (xarray, GDF, TorchGeo) are tested against direct |
58 | | - rasterio reads across 12 datasets (Sentinel-2, Landsat, NAIP, Copernicus DEM, |
59 | | - ESA WorldCover, AEF, and more). The TorchGeo path uses `rasterio.merge.merge` |
60 | | - as the oracle, matching TorchGeo's own read semantics. See |
61 | | - `test_dataset_pixel_comparison.py` and `test_network_smoke.py`. |
| 55 | +- All three output paths (xarray, GeoDataFrame, TorchGeo) tested against |
| 56 | + direct rasterio reads across 12 datasets (Sentinel-2, Landsat, NAIP, |
| 57 | + Copernicus DEM, ESA WorldCover, AEF, and more). |
62 | 58 |
|
63 | 59 | ### Breaking changes |
64 | 60 |
|
65 | | -- `get_xarray()` returns data in native COG dtype instead of always float32. Code that |
66 | | - assumed float32 output may need adjustment (e.g., `ds.B04.values.dtype` is now `uint16` |
67 | | - for Sentinel-2 instead of `float32`). |
68 | | -- The `xarray` extra no longer installs rioxarray. If you depend on `ds.rio.*` methods, |
69 | | - install rioxarray separately. |
| 61 | +- `get_xarray()` returns data in native COG dtype instead of always float32. |
| 62 | + Code that assumed float32 output may need adjustment. |
| 63 | +- The `xarray` extra no longer installs rioxarray. If you depend on |
| 64 | + `ds.rio.*` methods, install rioxarray separately. |
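
One place the dtype change can bite is band arithmetic: unsigned integers wrap around instead of going negative. The sketch below uses plain NumPy arrays as stand-ins for two `uint16` bands (the values and the `ndvi` helper are hypothetical, not part of Rasteret) and shows an explicit cast as one possible adjustment.

```python
import numpy as np

# Stand-ins for two Sentinel-2 bands returned in their native uint16 dtype.
red = np.array([[1000, 3000]], dtype=np.uint16)
nir = np.array([[3000, 1000]], dtype=np.uint16)

# Unsigned subtraction wraps around instead of going negative:
# the second element becomes 63536 rather than -2000.
wrapped = nir - red


def ndvi(nir_band, red_band):
    """Compute NDVI after an explicit cast to float32 (hypothetical helper)."""
    nir_f = nir_band.astype(np.float32)
    red_f = red_band.astype(np.float32)
    return (nir_f - red_f) / (nir_f + red_f)


# Casting first keeps the result in the expected [-1, 1] range.
safe = ndvi(nir, red)
```

Code that previously relied on float32 output can cast once at the point of use, which keeps the smaller native dtype everywhere else.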