@@ -25,17 +25,22 @@ Rasteret parses those headers **once**, caches them in Parquet, and its
2525own reader fetches pixels concurrently with no GDAL in the path.
2626** Up to 20x faster** on cold starts.
2727
28+ Because the index is Parquet, it's not just a cache - it's a table you
29+ work with. Filter by cloud cover or date range, join with your own labels
30+ or AOI polygons, add train/val/test splits as columns, query with DuckDB
31+ or PyArrow. When you need pixels, Rasteret fetches them on demand from the
32+ same table.
33+
2834- ** Easy** - three lines from STAC search or Parquet file to a TorchGeo-compatible dataset
2935- ** Zero downloads** - work with terabytes of imagery while storing only megabytes of metadata
3036- ** No STAC at training time** - query once at setup; zero API calls during training
3137- ** Reproducible** - same Parquet index = same records = same results
3238- ** Native dtypes** - uint16 stays uint16 in tensors; xarray promotes only when NaN fill requires it
33- - ** Shareable cache ** - a few MB index can capture scene selection, band metadata, and split assignments
39+ - ** Your dataset is a table ** - filter, enrich, version, and share a few MB Parquet file. The selection logic lives next to the data references.
3440
35- Rasteret is an ** opt-in accelerator** that integrates with TorchGeo by
36- returning a standard ` GeoDataset ` . Your samplers, DataLoader, xarray
37- workflows, and analysis tools stay the same - Rasteret handles the async
38- tile I/O underneath.
41+ Rasteret integrates with TorchGeo by returning a standard ` GeoDataset ` .
42+ Your samplers, DataLoader, xarray workflows, and analysis tools stay the
43+ same - Rasteret handles the async tile I/O underneath.
3944
4045---
4146
0 commit comments