Geospatial Core Functionality #6411
PhysicsACE
started this conversation in
Ideas
This is a discussion to align on which geospatial functionality should be core to daft.

For initial support, we want to integrate daft with geoarrow-rs. That requires a UnionArray implementation in daft-core, geospatial-specific logical types (point, linestring, etc.), and casting support via upstream geoarrow-cast.

For built-in functions, we can use the geoarrow-expr-geo and geoarrow (version "0.4.0-beta.4") crates. These would let us support scalar functions like area, length, and centroid, as well as predicate functions like contains and intersects. We can also use these crates to compute bounding boxes, which would support aggregations like st_envelope, and to support explode operations on list-based geospatial types. Together, this would create a foundation for storing and processing geospatial data in daft.

Looking at other geospatial dataframes and libraries, further features that could be added include IO support for GeoParquet or GeoJSON, spatial operators such as spatial joins and kNN queries/joins, and native index support for H3 or S2.

While there are many ways to extend geospatial support, I'm wondering how much of this the team intends to have as core functionality and how much could be implemented as an extension.
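To make the scalar vs. aggregation distinction concrete, here is a rough pure-Python sketch of the semantics I have in mind. It is not daft's API and not geoarrow's API: the helper names (`ring_area`, `envelope`, `st_envelope_agg`) and the representation of a polygon as a list of (x, y) tuples are assumptions made up for illustration only. A real implementation would operate on Arrow-backed arrays through the crates mentioned above.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]
Ring = List[Point]                        # hypothetical: a polygon's exterior ring
BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def ring_area(ring: Ring) -> float:
    """Scalar function: unsigned area of a simple ring (shoelace formula)."""
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def envelope(ring: Ring) -> BBox:
    """Scalar function: bounding box of a single geometry."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def st_envelope_agg(rings: Iterable[Ring]) -> BBox:
    """Aggregation: one bounding box covering every geometry in a column."""
    boxes = [envelope(r) for r in rings]
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


# A toy "geometry column" with a square and a triangle.
column = [
    [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    [(3.0, 1.0), (5.0, 1.0), (3.0, 4.0)],
]
areas = [ring_area(r) for r in column]  # one value per row
total_bbox = st_envelope_agg(column)    # one value for the whole column
```

The point is just that area, length, and centroid map one row to one value, while st_envelope reduces a whole column to a single box, so the two would likely plug into daft through different paths (expressions vs. aggregations).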