Merged
4 changes: 4 additions & 0 deletions .github/workflows/ci.yaml
@@ -57,3 +57,7 @@ jobs:
run: uv sync
- name: Test w/o extras
run: uv run pytest
- name: Check docs
# not worth it to install cairo on macos
if: runner.os == 'Linux'
run: uv run mkdocs build --strict
1 change: 1 addition & 0 deletions .gitignore
@@ -163,3 +163,4 @@ cython_debug/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.vscode/
docs/generated
44 changes: 23 additions & 21 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 0 additions & 4 deletions Cargo.toml
@@ -40,7 +40,3 @@ thiserror = "2.0.12"
tokio = { version = "1.44.0", features = ["rt-multi-thread"] }
pyo3-log = "0.12.1"
tracing = "0.1.41"

[patch.crates-io]
duckdb = { git = "https://github.com/duckdb/duckdb-rs", rev = "5eeb1f01c278790ce1e2d24045f0096e9e2528e4" }
libduckdb-sys = { git = "https://github.com/duckdb/duckdb-rs", rev = "5eeb1f01c278790ce1e2d24045f0096e9e2528e4" }
104 changes: 40 additions & 64 deletions README.md
@@ -41,80 +41,56 @@ conda install conda-forge::stacrs

Then:

```python
```python exec="on" source="above"
import asyncio
import stacrs

# Search a STAC API
items = await stacrs.search(
"https://landsatlook.usgs.gov/stac-server",
collections="landsat-c2l2-sr",
intersects={"type": "Point", "coordinates": [-105.119, 40.173]},
sortby="-properties.datetime",
max_items=100,
)

# If you installed with `stacrs[arrow]`:
from geopandas import GeoDataFrame

table = stacrs.to_arrow(items)
data_frame = GeoDataFrame.from_arrow(table)
items = stacrs.from_arrow(data_frame.to_arrow())

# Write items to a stac-geoparquet file
await stacrs.write("items.parquet", items)

# Read items from a stac-geoparquet file as an item collection
item_collection = await stacrs.read("items.parquet")

# You can search geoparquet files using DuckDB
# If you want to search a file on s3, make sure to configure your AWS environment first
item_collection = await stacrs.search("s3://bucket/items.parquet", ...)

# Use `search_to` for better performance if you know you'll be writing the items
# to a file
await stacrs.search_to(
"items.parquet",
"https://landsatlook.usgs.gov/stac-server",
collections="landsat-c2l2-sr",
intersects={"type": "Point", "coordinates": [-105.119, 40.173]},
sortby="-properties.datetime",
max_items=100,
)
async def main() -> None:
    # Search a STAC API
    items = await stacrs.search(
        "https://landsatlook.usgs.gov/stac-server",
        collections="landsat-c2l2-sr",
        intersects={"type": "Point", "coordinates": [-105.119, 40.173]},
        sortby="-properties.datetime",
        max_items=100,
    )

    # If you installed with `stacrs[arrow]`:
    from geopandas import GeoDataFrame

    table = stacrs.to_arrow(items)
    data_frame = GeoDataFrame.from_arrow(table)
    items = stacrs.from_arrow(data_frame.to_arrow())

    # Write items to a stac-geoparquet file
    await stacrs.write("/tmp/items.parquet", items)

    # Read items from a stac-geoparquet file as an item collection
    item_collection = await stacrs.read("/tmp/items.parquet")

    # Use `search_to` for better performance if you know you'll be writing the items
    # to a file
    await stacrs.search_to(
        "/tmp/items.parquet",
        "https://landsatlook.usgs.gov/stac-server",
        collections="landsat-c2l2-sr",
        intersects={"type": "Point", "coordinates": [-105.119, 40.173]},
        sortby="-properties.datetime",
        max_items=100,
    )

asyncio.run(main())
```

See [the documentation](https://stac-utils.github.io/stacrs) for details.
In particular, our [example notebook](https://stac-utils.github.io/stacrs/latest/example/) demonstrates some of the more interesting features.
In particular, our [examples](https://stac-utils.github.io/stacrs/latest/examples/) demonstrate some of the more interesting features.
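
The README search passes `sortby="-properties.datetime"` and `max_items=100`. In the STAC API sort convention, a leading `-` requests descending order on the named field, so the call asks for the newest items first, capped at 100. The sketch below only illustrates that meaning on plain dictionaries. `apply_sortby` is a hypothetical helper, not part of **stacrs**, and in a real search the server or DuckDB does the sorting.

```python
from typing import Any, Dict, List

Item = Dict[str, Any]


def apply_sortby(items: List[Item], sortby: str, max_items: int) -> List[Item]:
    """Sort by a dotted field path; a leading `-` means descending order."""
    descending = sortby.startswith("-")
    path = sortby.lstrip("+-").split(".")

    def key(item: Item) -> Any:
        # Walk the dotted path, e.g. item["properties"]["datetime"].
        value: Any = item
        for part in path:
            value = value[part]
        return value

    return sorted(items, key=key, reverse=descending)[:max_items]


# Hand-written sample items (illustrative data only).
items = [
    {"id": "a", "properties": {"datetime": "2024-01-01T00:00:00Z"}},
    {"id": "b", "properties": {"datetime": "2024-03-01T00:00:00Z"}},
    {"id": "c", "properties": {"datetime": "2024-02-01T00:00:00Z"}},
]
newest = apply_sortby(items, "-properties.datetime", max_items=2)
print([item["id"] for item in newest])
```

Sorting the timestamps as strings works here only because they share one ISO 8601 format and time zone.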

## CLI

**stacrs** comes with a CLI:

```shell
$ stacrs -h
stacrs: A command-line interface for the SpatioTemporal Asset Catalog (STAC)

Usage: stacrs [OPTIONS] <COMMAND>

Commands:
translate Translates STAC from one format to another
search Searches a STAC API or stac-geoparquet file
serve Serves a STAC API
validate Validates a STAC value
help Print this message or the help of the given subcommand(s)

Options:
-i, --input-format <INPUT_FORMAT>
The input format.
--opt <OPTIONS>
Options for getting and putting files from object storage.
-o, --output-format <OUTPUT_FORMAT>
The output format.
-c, --compact-json <COMPACT_JSON>
Whether to print compact JSON output [possible values: true, false]
--parquet-compression <PARQUET_COMPRESSION>
The parquet compression to use when writing stac-geoparquet.
-h, --help
Print help (see more with '--help')
```bash exec="on" source="above" result="text"
stacrs -h
```

> [!NOTE]
1 change: 0 additions & 1 deletion docs/api/migrate.md
@@ -5,4 +5,3 @@ description: Migrate STAC to another version
# Migration

::: stacrs.migrate
::: stacrs.migrate_href
563 changes: 0 additions & 563 deletions docs/example.ipynb

This file was deleted.

3 changes: 3 additions & 0 deletions docs/examples/README.md
@@ -0,0 +1,3 @@
# Examples

Examples of using **stacrs**.
27 changes: 27 additions & 0 deletions docs/examples/example_read.py
@@ -0,0 +1,27 @@
# type: ignore
"""
# Reading and plotting
"""

# %%
# Reading is done via a top-level async function.
import stacrs

items = await stacrs.read("https://github.com/stac-utils/stacrs/raw/refs/heads/main/data/100-sentinel-2-items.parquet")
items

# %%
# Let's take a look at some of the attributes of the STAC items.
import pandas
from geopandas import GeoDataFrame

data_frame = GeoDataFrame.from_features(items)
data_frame["datetime"] = pandas.to_datetime(data_frame["datetime"])
data_frame[["geometry", "datetime", "s2:snow_ice_percentage"]]

# %%
# How does the snow and ice percentage vary over the year?
from matplotlib.dates import DateFormatter

axis = data_frame.plot(x="datetime", y="s2:snow_ice_percentage", kind="scatter")
axis.xaxis.set_major_formatter(DateFormatter("%b"))
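
The cells above rely on top-level `await`, which works in notebook-style execution but is a syntax error in a plain Python script. A minimal sketch of the script form follows, using the same `asyncio.run(main())` pattern as the README. `fake_read` is a hypothetical stand-in for `stacrs.read` so the block runs without network access, and the shape of its return value (a GeoJSON-style feature collection) is an assumption.

```python
import asyncio
from typing import Any, Dict


async def fake_read(href: str) -> Dict[str, Any]:
    """Hypothetical stand-in for `stacrs.read`; returns a tiny item collection."""
    await asyncio.sleep(0)  # yield to the event loop, as a real I/O call would
    return {
        "type": "FeatureCollection",
        "features": [{"type": "Feature", "id": "item-0", "properties": {"href": href}}],
    }


async def main() -> Dict[str, Any]:
    # In a real script this line would be `await stacrs.read(...)`.
    return await fake_read("items.parquet")


# `asyncio.run` creates an event loop, runs `main` to completion, and closes the loop.
items = asyncio.run(main())
print(len(items["features"]))
```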
48 changes: 48 additions & 0 deletions docs/examples/example_search.py
@@ -0,0 +1,48 @@
# type: ignore
"""
# Searching
"""

# %%
# Search a STAC API with `stacrs.search`:
import contextily
import pandas
import stacrs
from geopandas import GeoDataFrame

items = await stacrs.search(
"https://stac.eoapi.dev",
collections="MAXAR_Marshall_Fire_21_Update"
)
data_frame = GeoDataFrame.from_features(items)
data_frame["datetime"] = pandas.to_datetime(data_frame["datetime"])
axis = data_frame.set_crs(epsg=4326).to_crs(epsg=3857).plot(alpha=0.5, edgecolor="k")
contextily.add_basemap(axis, source=contextily.providers.CartoDB.Positron)
axis.set_axis_off()

# %%
# Search [stac-geoparquet](https://github.com/stac-utils/stac-geoparquet/blob/main/spec/stac-geoparquet-spec.md) with [DuckDB](https://duckdb.org/), no servers required!

items = await stacrs.search(
"../../data/100-sentinel-2-items.parquet",
datetime="2024-12-01T00:00:00Z/..",
)
data_frame = GeoDataFrame.from_features(items)
data_frame["datetime"] = pandas.to_datetime(data_frame["datetime"])
data_frame[["datetime", "geometry"]]

# %%
# If you know you're converting to a [geopandas.GeoDataFrame][] (or anything else that
# speaks Arrow), you can install the `arrow` optional dependency for **stacrs**
# (`pip install 'stacrs[arrow]'`) and search directly to Arrow, which can be more
# efficient than going through JSON dictionaries:

from stacrs import DuckdbClient

client = DuckdbClient()
table = client.search_to_arrow(
"../../data/100-sentinel-2-items.parquet",
datetime="2024-12-01T00:00:00Z/..",
)
data_frame = GeoDataFrame.from_arrow(table)
data_frame[["datetime", "geometry"]]
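
Both DuckDB searches above pass `datetime="2024-12-01T00:00:00Z/.."`. In the STAC API convention this is an interval with an open end: `..` means "unbounded", so the filter keeps items dated on or after 1 December 2024. The sketch below shows that interpretation on plain `datetime` objects. `parse_bound`, `parse_interval` and `in_interval` are hypothetical helpers for illustration, not functions from **stacrs**, and the real filtering happens inside the search backend.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple

Bound = Optional[datetime]


def parse_bound(text: str) -> Bound:
    """Parse one side of an interval; `..` or an empty string means unbounded."""
    if text in ("..", ""):
        return None
    # `fromisoformat` does not accept a trailing `Z` before Python 3.11.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def parse_interval(value: str) -> Tuple[Bound, Bound]:
    """Split `start/end` into two bounds; a single instant is a closed point."""
    if "/" not in value:
        instant = parse_bound(value)
        return instant, instant
    start, end = value.split("/", 1)
    return parse_bound(start), parse_bound(end)


def in_interval(moment: datetime, interval: Tuple[Bound, Bound]) -> bool:
    """Return True when `moment` falls inside the (possibly open) interval."""
    start, end = interval
    return (start is None or moment >= start) and (end is None or moment <= end)


interval = parse_interval("2024-12-01T00:00:00Z/..")
november = datetime(2024, 11, 30, tzinfo=timezone.utc)
december = datetime(2024, 12, 15, tzinfo=timezone.utc)
print(in_interval(november, interval), in_interval(december, interval))
```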