
Commit 25d778b

docs: improve
1 parent 271f7c8 commit 25d778b

2 files changed: 30 additions, 8 deletions


README.md

Lines changed: 10 additions & 5 deletions
@@ -25,17 +25,22 @@ Rasteret parses those headers **once**, caches them in Parquet, and its
 own reader fetches pixels concurrently with no GDAL in the path.
 **Up to 20x faster** on cold starts.
 
+Because the index is Parquet, it's not just a cache - it's a table you
+work with. Filter by cloud cover or date range, join with your own labels
+or AOI polygons, add train/val/test splits as columns, query with DuckDB
+or PyArrow. When you need pixels, Rasteret fetches them on demand from the
+same table.
+
 - **Easy** - three lines from STAC search or Parquet file to a TorchGeo-compatible dataset
 - **Zero downloads** - work with terabytes of imagery while storing only megabytes of metadata
 - **No STAC at training time** - query once at setup; zero API calls during training
 - **Reproducible** - same Parquet index = same records = same results
 - **Native dtypes** - uint16 stays uint16 in tensors; xarray promotes only when NaN fill requires it
-- **Shareable cache** - a few MB index can capture scene selection, band metadata, and split assignments
+- **Your dataset is a table** - filter, enrich, version, and share a few MB Parquet file. The selection logic lives next to the data references.
 
-Rasteret is an **opt-in accelerator** that integrates with TorchGeo by
-returning a standard `GeoDataset`. Your samplers, DataLoader, xarray
-workflows, and analysis tools stay the same - Rasteret handles the async
-tile I/O underneath.
+Rasteret integrates with TorchGeo by returning a standard `GeoDataset`.
+Your samplers, DataLoader, xarray workflows, and analysis tools stay the
+same - Rasteret handles the async tile I/O underneath.
 
 ---

docs/index.md

Lines changed: 20 additions & 3 deletions
@@ -17,11 +17,13 @@
 !!! success "What Rasteret does"
 
 Parse headers **once**, cache in Parquet, read pixels concurrently
-with no GDAL in the path.
+with no GDAL in the path. Because the index is Parquet, it's also
+the table you work with - filter, join, enrich, and query with
+standard tools before you ever fetch a pixel.
 
 ```text
-STAC API / GeoParquet --> Parquet Index --> Tile-level byte reads
-(once) (queryable) (no GDAL, no headers)
+STAC API / GeoParquet --> Collection (Parquet) --> Tile-level byte reads
+(once) (queryable, enrichable) (no GDAL, no headers)
 ```
 
 ---
@@ -56,6 +58,21 @@
 Same Parquet index = same records = same results.
 Share a few MB file and collaborators skip re-indexing.
 
+- :material-table-edit:{ .lg .middle } **Your dataset is a table**
+
+---
+
+Filter, join, enrich with DuckDB or PyArrow. Add splits,
+labels, and quality flags as columns. The index is the dataset.
+
+- :material-swap-horizontal:{ .lg .middle } **Any Parquet with COG URLs**
+
+---
+
+`build_from_table()` turns existing GeoParquet into a
+Collection. Source Cooperative exports, STAC GeoParquet,
+custom catalogs - if it has URLs, Rasteret can read it.
+
 </div>
 
 ---
