Commit 627d455

add better messagin
1 parent 8b4c9e6 commit 627d455

3 files changed, +24 -0 lines changed

README.md

Lines changed: 7 additions & 0 deletions
@@ -25,6 +25,13 @@ Rasteret parses those headers **once**, caches them in Parquet, and its
 own reader fetches pixels concurrently with no GDAL in the path.
 **Up to 20x faster** on cold starts.

+We call this pattern **index-first geospatial image retrieval**:
+
+- **Control plane**: a queryable Parquet index (scene metadata, COG header metadata, user columns like splits/labels)
+- **Data plane**: on-demand tile reads from the original GeoTIFF/COG objects
+
+This keeps metadata and experiment logic in tables while leaving imagery bytes in source COGs.
+
 - **Easy** - three lines from STAC search or Parquet file to a TorchGeo-compatible dataset
 - **Zero downloads** - work with terabytes of imagery while storing only megabytes of metadata
 - **No STAC at training time** - query once at setup; zero API calls during training
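
The control plane / data plane split added in this hunk can be illustrated with a small sketch. It is not Rasteret's API: `IndexRow`, `select_scenes`, `plan_tile_read`, the column names, and the example URLs are all hypothetical, and a plain Python list stands in for the Parquet index. The only borrowed facts are that a tiled TIFF header records per-tile byte offsets and byte counts, and that HTTP `Range` end positions are inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRow:
    """One row of a hypothetical index table (stand-in for a Parquet row)."""
    scene_id: str
    url: str                 # location of the source COG (made-up example URL)
    split: str               # user column, e.g. "train" or "val"
    tile_offsets: tuple      # byte offset of each tile, as cached from the header
    tile_byte_counts: tuple  # byte length of each tile, as cached from the header


def select_scenes(index, split):
    """Control plane: filter the table. No imagery bytes are touched."""
    return [row for row in index if row.split == split]


def plan_tile_read(row, tile_id):
    """Data plane: turn cached header values into one HTTP range request."""
    start = row.tile_offsets[tile_id]
    end = start + row.tile_byte_counts[tile_id] - 1  # Range end is inclusive
    return row.url, {"Range": f"bytes={start}-{end}"}


# Hypothetical index contents, for illustration only.
INDEX = [
    IndexRow("scene-a", "https://example.com/a.tif", "train", (1024, 5120), (4096, 2048)),
    IndexRow("scene-b", "https://example.com/b.tif", "val", (2048,), (1000,)),
]

train = select_scenes(INDEX, "train")
url, headers = plan_tile_read(train[0], 1)
print(url, headers)
```

Selecting scenes and planning the read both use only the table; the source object would be contacted only when the planned range request is actually issued.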

docs/explanation/design-decisions.md

Lines changed: 10 additions & 0 deletions
@@ -4,6 +4,16 @@ This page documents the key design choices behind Rasteret and the reasoning
 that drives them. It is aimed at contributors and advanced users who want to
 understand *why* things work the way they do.

+## Category framing
+
+Rasteret follows an **index-first geospatial retrieval** architecture:
+
+- **Control plane (tables/Parquet)**: discovery, filtering, splits/labels, and cached COG header metadata.
+- **Data plane (COG object storage)**: on-demand tile byte reads from source GeoTIFFs.
+
+This separation is intentional. It preserves table interoperability for metadata
+workflows while avoiding payload-in-Parquet duplication for routine pixel reads.
+
 ## Why Parquet indexes?

 Remote COGs require an HTTP HEAD + IFD range read per file just to discover
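
The per-file header cost mentioned in the context line above can be put into rough numbers. This is a back-of-the-envelope sketch, not a measurement: it assumes two requests per file (one HEAD, one IFD range read), assumes a reader without a cached index repeats that discovery on every pass over the data, and uses made-up file and pass counts. `header_requests` is a hypothetical helper.

```python
REQUESTS_PER_FILE = 2  # assumed: one HEAD plus one IFD range read per COG


def header_requests(n_files: int, n_passes: int, cached: bool) -> int:
    """Count header-discovery requests over n_passes passes through n_files COGs."""
    if n_files < 0 or n_passes < 0:
        raise ValueError("counts must be non-negative")
    discovery = REQUESTS_PER_FILE * n_files
    # With a cached index, discovery is paid once when the index is built.
    # Without one (by assumption) it is repeated on every pass.
    return discovery if cached else discovery * n_passes


# Hypothetical workload: 1000 files read over 10 passes.
uncached = header_requests(1000, 10, cached=False)
cached = header_requests(1000, 10, cached=True)
print(uncached, cached)  # 20000 2000
```

Under these assumptions the cached index removes the per-pass multiplier from header discovery; tile reads themselves are unaffected and are not counted here.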

docs/index.md

Lines changed: 7 additions & 0 deletions
@@ -24,6 +24,13 @@
 (once) (queryable) (no GDAL, no headers)
 ```

+!!! info "Category: index-first geospatial retrieval"
+
+    Rasteret treats Parquet as the **control plane** (scene metadata + COG
+    header metadata + user-enriched columns), and COG object storage as the
+    **data plane** (pixel bytes fetched on demand). Metadata stays table-native;
+    imagery bytes stay in source COGs.
+
 ---

 <div class="grid cards" markdown>
