Commit 6d01185

Add initial support for geodatafusion spatial filters. (#31)
* Convert bbox representation to wkb.
* Register geodatafusion to support spatial filtering.
* Update pyo3 bindings to datafusion 52.
* Align zarrs_icechunk compatible icechunk-python versions.
* Include geodatafusion spatial function registration tests.
* Update geodatafusion python to 0.2.0 for datafusion 52 compatibility.
* Fix cargo fmt issues.
* Fix ruff error.
* Update test data with new VariableLengthBytes bbox array.
* Export full API from Python module.
* Update deprecated register_table_provider to register_table.
1 parent 682d2bf commit 6d01185

31 files changed: +3748 -2545 lines changed

Cargo.lock

Lines changed: 1197 additions & 799 deletions
Diff not rendered: generated files are hidden by default.

Cargo.toml

Lines changed: 7 additions & 6 deletions
@@ -4,13 +4,14 @@ version = "0.1.0"
 edition = "2024"
 
 [dependencies]
-arrow = "56.2.0"
-arrow-array = "56.2.0"
-arrow-schema = "56.2.0"
+arrow = "57.0.0"
+arrow-array = "57.0.0"
+arrow-schema = "57.0.0"
 async-trait = "0.1.89"
-datafusion = "50.2"
+datafusion = "52.0"
 futures = "0.3.31"
-geoarrow-schema = "0.6.1"
+geodatafusion = "0.3"
+geoarrow-schema = "0.7.0"
 icechunk = "0.3.16"
 object_store = "0.12.4"
 thiserror = "2"
@@ -23,5 +24,5 @@ zarrs_object_store = "0.5.0"
 zarrs_storage = { version = "0.4.0", features = ["async"] }
 
 [dev-dependencies]
-geoarrow-array = "0.6.1"
+geoarrow-array = "0.7.0"
 tokio = { version = "1.48", features = ["macros", "rt-multi-thread"] }

README.md

Lines changed: 1 addition & 3 deletions
@@ -11,9 +11,7 @@ Users can define arbitrary schemas where the 1-dimensional arrays each use a `dt
 - Inside a Zarr group named `"meta"`
 - A `datetime64[ms]` array named `"date"` with `n` timestamps named `"date"` with `n` timestamps.
 - A `VariableLengthUTF8` array named `"collection"` with `n` string values.
-- A `VariableLengthUTF8` array named `"bbox"` with `n` string values, stored as a `VariableLengthUTF8` array, where each string is a WKT-encoded Polygon (or MultiPolygon) with the bounding box of that Zarr record.
-
-In the future, we will likely use a binary encoding like WKB, but Zarr's binary dtype is [not currently well-specified](https://github.com/zarr-developers/zarr-python/issues/3517).
+- A `VariableLengthBytes` array named `"bbox"` with `n` binary values, where each value is a WKB-encoded Polygon (or MultiPolygon) with the bounding box of that Zarr record.
 
 This data schema may change over time.
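
For readers unfamiliar with WKB, the following is a minimal sketch of what a WKB-encoded bounding-box Polygon looks like at the byte level, using only the Python standard library. It is not taken from this repository: the helper names `bbox_to_wkb` and `wkb_polygon_bounds` are hypothetical, and the little-endian byte order and counter-clockwise ring order are assumptions made for illustration. The project itself may produce these values with a geometry library instead.

```python
import struct

WKB_POLYGON = 3  # geometry type code for a 2D Polygon in WKB


def bbox_to_wkb(minx: float, miny: float, maxx: float, maxy: float) -> bytes:
    """Encode a bounding box as a little-endian 2D WKB Polygon with one ring."""
    ring = [
        (minx, miny),
        (maxx, miny),
        (maxx, maxy),
        (minx, maxy),
        (minx, miny),  # the ring repeats its first point to close itself
    ]
    # Header: byte-order flag (1 = little-endian), geometry type, ring count.
    out = struct.pack("<BII", 1, WKB_POLYGON, 1)
    # Ring: point count followed by (x, y) pairs as 64-bit floats.
    out += struct.pack("<I", len(ring))
    for x, y in ring:
        out += struct.pack("<dd", x, y)
    return out


def wkb_polygon_bounds(wkb: bytes) -> tuple[float, float, float, float]:
    """Read (minx, miny, maxx, maxy) from the first ring of a 2D WKB Polygon."""
    order = "<" if wkb[0] == 1 else ">"
    geom_type, n_rings = struct.unpack_from(order + "II", wkb, 1)
    if geom_type != WKB_POLYGON or n_rings < 1:
        raise ValueError("expected a non-empty 2D WKB Polygon")
    (n_points,) = struct.unpack_from(order + "I", wkb, 9)
    coords = struct.unpack_from(f"{order}{2 * n_points}d", wkb, 13)
    xs, ys = coords[0::2], coords[1::2]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    value = bbox_to_wkb(-10.0, -5.0, 10.0, 5.0)
    print(len(value), wkb_polygon_bounds(value))
```

Each such `bytes` value would correspond to one element of the `"bbox"` array described above. A binary encoding like this avoids parsing text coordinates at query time, which is presumably part of the motivation for moving away from WKT.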

-168 Bytes
Binary file not shown.
-160 Bytes
Binary file not shown.
167 Bytes
Binary file not shown.
160 Bytes
Binary file not shown.
-231 Bytes
Binary file not shown.
234 Bytes
Binary file not shown.
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-{"snapshot":"CMNB1D4K0S56F8Z7KA00"}
+{"snapshot":"ZKWW9E6BW6YFNNDKESR0"}
