Commit cac322a

Bump for Python 0.2 (#105)
* Bump for Python 0.2
* Update changelog
1 parent 9350af4 commit cac322a

9 files changed

+182
-13
lines changed

README.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,7 @@ A Rust crate for packed, immutable, zero-copy spatial indexes.
1717

1818
- **An R-tree and k-d tree written in safe Rust.**
1919
- **Fast.** Because immutable indexes allow additional optimizations, it tends to be faster than dynamic implementations like [`rstar`](https://github.com/georust/rstar).
20-
- **Memory efficient.** The index is fully _packed_, meaning that all nodes are at full capacity (except for the last node at each tree level). This means the RTree and k-d tree use less memory. And because the index is backed by a single buffer, it exhibits excellent memory locality. For any number of input geometries, the peak memory required both to build the index and to store the index can be pre-computed.
20+
- **Memory efficient.** The index is fully _packed_, meaning that all nodes are at full capacity (except for the last node at each tree level). This means the RTree and k-d tree use less memory. And because the index is backed by a single buffer, it exhibits excellent memory locality.
2121
- **Bounded memory**. For any given number of items and node size, you can infer the total memory used by the RTree or KDTree.
2222
- **Multiple R-tree sorting methods.** Currently, [hilbert](https://en.wikipedia.org/wiki/Hilbert_R-tree) and [sort-tile-recursive (STR)](https://ia600900.us.archive.org/27/items/nasa_techdoc_19970016975/19970016975.pdf) sorting methods are implemented, but it's extensible to other spatial sorting algorithms in the future, like [overlap-minimizing top-down (OMT)](https://ceur-ws.org/Vol-74/files/FORUM_18.pdf).
2323
- **ABI-stable:** the index is contained in a single `Vec<u8>`, compatible with the [`flatbush`](https://github.com/mourner/flatbush) and [`kdbush`](https://github.com/mourner/kdbush) JavaScript libraries. Being ABI-stable means that the spatial index can be persisted for later use or shared zero-copy between Rust and another program like Python.

python/CHANGELOG.md

Lines changed: 21 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,26 @@
11
# Changelog
22

3-
## Unreleased
4-
5-
- Raise runtime warning for debug builds by @kylebarron in https://github.com/kylebarron/geo-index/pull/63
6-
- RTree Buffer protocol, python binding tests by @H-Plus-Time in https://github.com/kylebarron/geo-index/pull/55
7-
- Python: Return boxes as arrow from RTree by @kylebarron in https://github.com/kylebarron/geo-index/pull/89
8-
- Update Python API & implement RTree partitions, nearest neighbor search by @kylebarron in https://github.com/kylebarron/geo-index/pull/87
9-
- Update python bindings by @kylebarron in https://github.com/kylebarron/geo-index/pull/78
10-
- Add `__repr__` to Python classes by @kylebarron in https://github.com/kylebarron/geo-index/pull/84
11-
- Python docs website & improved rtree & kdtree docs by @kylebarron in https://github.com/kylebarron/geo-index/pull/93
3+
## [0.2.0] - 2025-01-06
4+
5+
### New Features
6+
7+
- Support for nearest neighbor searching on RTrees with [`neighbors`](https://kylebarron.dev/geo-index/v0.2.0/api/rtree/#geoindex_rs.rtree.neighbors).
8+
- Join two RTrees together with [`tree_join`](https://kylebarron.dev/geo-index/v0.2.0/api/rtree/#geoindex_rs.rtree.tree_join), finding their overlapping elements. This is the first step of a spatial join: finding which elements from two different data sources potentially intersect.
9+
- Extract partitioning structure from the underlying RTree with [`partitions`](https://kylebarron.dev/geo-index/v0.2.0/api/rtree/#geoindex_rs.rtree.partitions) and see the partition geometries with [`partition_boxes`](https://kylebarron.dev/geo-index/v0.2.0/api/rtree/#geoindex_rs.rtree.partition_boxes).
10+
- Expose [`RTreeMetadata`](https://kylebarron.dev/geo-index/v0.2.0/api/rtree/#geoindex_rs.rtree.RTreeMetadata) and [`KDTreeMetadata`](https://kylebarron.dev/geo-index/v0.2.0/api/kdtree/#geoindex_rs.kdtree.KDTreeMetadata). These allow you to infer the memory usage a tree would incur.
11+
- Inspect the tree internals by accessing the RTree's internal boxes with `boxes_at_level`.
12+
- Implement the buffer protocol on `RTree` and `KDTree`. This means you can copy the underlying buffer to Python with `bytes(tree)`.
13+
14+
### Breaking
15+
16+
- **Move RTree and KDTree query functions to standalone global functions**. This
17+
makes it easier to persist index buffers and reuse them later, because the
18+
query functions work on any object supporting the buffer protocol.
19+
- **Create "builder" classes**: `RTreeBuilder` and `KDTreeBuilder`. Having these as separate classes allows for iteratively adding the coordinates for an RTree or KDTree. This is useful when the source geometries are too large to fit in memory.
20+
21+
### Documentation
22+
23+
- New documentation website for Python bindings.
1224

1325
## [0.1.0] - 2024-03-26
1426

python/Cargo.lock

Lines changed: 1 addition & 1 deletion
Generated file; diff not rendered.

python/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
[package]
22
name = "geoindex-rs"
3-
version = "0.2.0-beta.1"
3+
version = "0.2.0"
44
authors = ["Kyle Barron <[email protected]>"]
55
edition = "2021"
66
description = "Fast, memory-efficient 2D spatial indexes for Python."

python/README.md

Lines changed: 105 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,10 +5,20 @@
55
[pypi_badge]: https://badge.fury.io/py/geoindex-rs.svg
66
[pypi_link]: https://pypi.org/project/geoindex-rs/
77

8-
Fast, memory-efficient 2D spatial indexes for Python.
8+
Fast, memory-efficient, zero-copy spatial indexes for Python.
99

1010
This documentation is for the Python bindings. [Refer here for the Rust crate documentation](https://docs.rs/geo-index).
1111

12+
## Features
13+
14+
- **An R-tree and k-d tree written in Rust, compiled for Python.**
15+
- **Fast.** The Rust core and the immutable index design make the spatial indexes very fast. Additionally, the index builders accept vectorized NumPy or [Arrow](https://arrow.apache.org/) input.
16+
- **Memory efficient.** The index is fully _packed_, meaning that all nodes are at full capacity (except for the last node at each tree level). This means the RTree and k-d tree use less memory.
17+
- **Bounded memory**. For any given number of items and node size, you can infer the total memory used by the RTree or KDTree.
18+
- **Multiple R-tree sorting methods.** Currently, [hilbert](https://en.wikipedia.org/wiki/Hilbert_R-tree) and [sort-tile-recursive (STR)](https://ia600900.us.archive.org/27/items/nasa_techdoc_19970016975/19970016975.pdf) sorting methods are implemented.
19+
- **ABI-stable:** the index is contained in a single buffer, compatible with the [`flatbush`](https://github.com/mourner/flatbush) and [`kdbush`](https://github.com/mourner/kdbush) JavaScript libraries. Being ABI-stable means that the spatial index can be persisted for later use or shared zero-copy between Rust and Python.
20+
- **Supports float64 or float32 coordinates:** for 2x memory savings, use float32 coordinates in the spatial index.
21+
1222
## Install
1323

1424
```
@@ -20,3 +30,97 @@ or with Conda:
2030
```
2131
conda install geoindex-rs
2232
```
33+
34+
## Examples
35+
36+
Building an RTree and searching for nearest neighbors.
37+
38+
```py
39+
import numpy as np
40+
from geoindex_rs import rtree as rt
41+
42+
# Three bounding boxes
43+
min_x = np.arange(0, 3)
44+
min_y = np.arange(0, 3)
45+
max_x = np.arange(2, 5)
46+
max_y = np.arange(2, 5)
47+
48+
# When creating the builder, the total number of items must be passed
49+
builder = rt.RTreeBuilder(num_items=3)
50+
51+
# Add the bounding boxes to the builder
52+
builder.add(min_x, min_y, max_x, max_y)
53+
54+
# Consume the builder (sorting the index) and create the RTree.
55+
tree = builder.finish()
56+
57+
# Find the nearest neighbors in the RTree to the point (5, 5)
58+
results = rt.neighbors(tree, 5, 5)
59+
60+
# For performance, results are returned as an Arrow array.
61+
assert results.to_pylist() == [2, 1, 0]
62+
```
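To see why the result is `[2, 1, 0]`, the ordering can be reproduced without the library. This is a brute-force sketch in plain Python, not how the RTree searches internally. The helper `box_distance_sq` is a hypothetical name introduced here, and the sketch assumes neighbors are ranked by Euclidean distance from the query point to each bounding box.

```python
def box_distance_sq(px, py, min_x, min_y, max_x, max_y):
    """Squared distance from a point to an axis-aligned box (0 if inside)."""
    dx = max(min_x - px, 0, px - max_x)
    dy = max(min_y - py, 0, py - max_y)
    return dx * dx + dy * dy


# The same three boxes as above, as (min_x, min_y, max_x, max_y) tuples
boxes = [(0, 0, 2, 2), (1, 1, 3, 3), (2, 2, 4, 4)]

# Squared distance from the query point (5, 5) to every box
distances = [box_distance_sq(5, 5, *box) for box in boxes]

# Insertion indexes sorted from nearest to farthest
order = sorted(range(len(boxes)), key=lambda i: distances[i])

assert distances == [18, 8, 2]
assert order == [2, 1, 0]
```

A linear scan like this costs O(n) per query; the point of the index is to avoid visiting every box.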
63+
64+
Building a KDTree and searching within a bounding box.
65+
66+
```py
67+
import numpy as np
68+
from geoindex_rs import kdtree as kd
69+
70+
# Three points: (0, 2), (1, 3), (2, 4)
71+
x = np.arange(0, 3)
72+
y = np.arange(2, 5)
73+
74+
# When creating the builder, the total number of items must be passed
75+
builder = kd.KDTreeBuilder(3)
76+
77+
# Add the points to the builder
78+
builder.add(x, y)
79+
80+
# Consume the builder (sorting the index) and create the KDTree.
81+
tree = builder.finish()
82+
83+
# Search within this bounding box:
84+
results = kd.range(tree, 2, 4, 7, 9)
85+
86+
# For performance, results are returned as an Arrow array.
87+
assert results.to_pylist() == [2]
88+
```
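The expected result can be checked by hand with a linear scan. This sketch is plain Python and does not use the library; it assumes the query box bounds are inclusive, which the result above implies because the point `(2, 4)` sits exactly on the box corner.

```python
# The same three points as above
points = [(0, 2), (1, 3), (2, 4)]

# Query bounding box, in the same argument order as the range call above
min_x, min_y, max_x, max_y = 2, 4, 7, 9

# Keep the insertion index of every point inside the box (bounds inclusive)
matches = [
    i
    for i, (x, y) in enumerate(points)
    if min_x <= x <= max_x and min_y <= y <= max_y
]

assert matches == [2]
```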
89+
90+
## Persisting the spatial index
91+
92+
The `RTree` and `KDTree` classes implement the Python buffer protocol, so you
93+
can pass an instance of the index directly to `bytes` to copy the underlying
94+
spatial index into a buffer. Then you can save that buffer somewhere, load it
95+
again, and use it directly for queries!
96+
97+
```py
98+
import numpy as np
99+
from geoindex_rs import rtree as rt
100+
101+
min_x = np.arange(0, 3)
102+
min_y = np.arange(0, 3)
103+
max_x = np.arange(2, 5)
104+
max_y = np.arange(2, 5)
105+
106+
builder = rt.RTreeBuilder(num_items=3)
107+
builder.add(min_x, min_y, max_x, max_y)
108+
tree = builder.finish()
109+
110+
# Copy to a Python bytes object
111+
copied_tree = bytes(tree)
112+
113+
# The entire RTree is contained within this 144 byte buffer
114+
assert len(copied_tree) == 144
115+
116+
# We can use the bytes object (or anything else implementing the Python buffer
117+
# protocol) directly in searches
118+
results = rt.neighbors(copied_tree, 5, 5)
119+
assert results.to_pylist() == [2, 1, 0]
120+
```
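The 144-byte figure above can be reproduced by hand. The sketch below is an assumption-laden estimate, not the library's own code: it assumes the buffer follows the `flatbush` layout (an 8-byte header, four coordinates per node, then one index per node, stored as 16-bit integers when there are fewer than 16384 nodes) and a default node size of 16. The function name `packed_rtree_nbytes` is hypothetical.

```python
import math


def packed_rtree_nbytes(num_items, node_size=16, coord_bytes=8):
    """Estimate the buffer size of a packed R-tree (flatbush-style layout assumed)."""
    # Count the nodes on every level, from the leaves up to the single root
    n = num_items
    num_nodes = n
    while True:
        n = math.ceil(n / node_size)
        num_nodes += n
        if n == 1:
            break

    header_bytes = 8
    # Each node stores a box: min_x, min_y, max_x, max_y
    boxes_bytes = num_nodes * 4 * coord_bytes
    # Assumed: 2-byte indices for small trees, 4-byte indices otherwise
    index_bytes = num_nodes * (2 if num_nodes < 16384 else 4)
    return header_bytes + boxes_bytes + index_bytes


# Three float64 boxes: 3 leaves + 1 root = 4 nodes
assert packed_rtree_nbytes(3) == 144
```

Under the same assumptions, `packed_rtree_nbytes(3, coord_bytes=4)` gives 80 bytes for float32 coordinates. For real use, prefer the `RTreeMetadata` class mentioned in the changelog over a hand-rolled estimate.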
121+
122+
## Drawbacks
123+
124+
- Trees are _immutable_. After creating the index, items can no longer be added or removed.
125+
- Only two-dimensional indexes are supported. This can still be used with higher-dimensional input data as long as it's fine to only index two of the dimensions.
126+
- Queries return insertion indexes into the input set, so you must manage your own collections.
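Because queries return insertion indexes, the caller keeps the original records in a separate collection and looks them up by position. A minimal sketch in plain Python; the `records` list is hypothetical, and `result_indexes` stands in for the list returned by `results.to_pylist()` in the examples above.

```python
# Records kept in the same order in which their boxes were added to the builder
records = [
    {"name": "first"},
    {"name": "second"},
    {"name": "third"},
]

# Stand-in for results.to_pylist() from a neighbors query
result_indexes = [2, 1, 0]

# Map each insertion index back to the record it refers to
nearest_first = [records[i] for i in result_indexes]

assert [r["name"] for r in nearest_first] == ["third", "second", "first"]
```

This only works if the records stay in insertion order, so avoid reordering or filtering the collection after the index is built.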

python/docs/CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
../CHANGELOG.md

python/mkdocs.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,7 @@ nav:
2424
- API Reference:
2525
- api/rtree.md
2626
- api/kdtree.md
27+
- Changelog: CHANGELOG.md
2728

2829
watch:
2930
- python
@@ -96,6 +97,7 @@ plugins:
9697
- griffe_inherited_docstrings
9798

9899
import:
100+
- https://arrow.apache.org/docs/objects.inv
99101
- https://docs.python.org/3/objects.inv
100102
- https://kylebarron.dev/arro3/latest/objects.inv
101103

python/python/geoindex_rs/kdtree.pyi

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,22 @@ def range(
3434
3535
Results are the insertion indexes of items that match the query.
3636
37+
**Example:**
38+
39+
```py
40+
import numpy as np
41+
from geoindex_rs import kdtree as kd
42+
43+
builder = kd.KDTreeBuilder(3)
44+
x = np.arange(0, 3)
45+
y = np.arange(2, 5)
46+
builder.add(x, y)
47+
tree = builder.finish()
48+
49+
results = kd.range(tree, 2, 4, 7, 9)
50+
assert results.to_pylist() == [2]
51+
```
52+
3753
Args:
3854
index: the KDTree to search.
3955
min_x: The `min_x` coordinate of the query bounding box.
@@ -79,6 +95,9 @@ class KDTreeBuilder:
7995
y = np.arange(2, 5)
8096
builder.add(x, y)
8197
tree = builder.finish()
98+
99+
results = kd.range(tree, 2, 4, 7, 9)
100+
assert results.to_pylist() == [2]
82101
```
83102
"""
84103
def __init__(
@@ -127,6 +146,9 @@ class KDTreeBuilder:
127146
128147
Returns:
129148
An Arrow array with the insertion index of each element, which provides a lookup back into the original data.
149+
150+
This can be converted to a [`pyarrow.Array`][] by passing it to
151+
[`pyarrow.array`][].
130152
"""
131153
def finish(self) -> KDTree:
132154
"""Sort the internal index and convert this class to a KDTree instance.

python/python/geoindex_rs/rtree.pyi

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -365,6 +365,9 @@ class RTreeBuilder:
365365
366366
Returns:
367367
An Arrow array with the insertion index of each element, which provides a lookup back into the original data.
368+
369+
This can be converted to a [`pyarrow.Array`][] by passing it to
370+
[`pyarrow.array`][].
368371
"""
369372
def finish(self, method: Literal["hilbert", "str"] = "hilbert") -> RTree:
370373
"""Sort the internal index and convert this class to an RTree instance.
@@ -392,6 +395,31 @@ class RTree(Buffer):
392395
This class implements the Python buffer protocol, so you can pass it to the Python
393396
`bytes` constructor to copy the underlying binary memory into a Python `bytes`
394397
object.
398+
399+
```py
400+
import numpy as np
401+
from geoindex_rs import rtree as rt
402+
403+
min_x = np.arange(0, 3)
404+
min_y = np.arange(0, 3)
405+
max_x = np.arange(2, 5)
406+
max_y = np.arange(2, 5)
407+
408+
builder = rt.RTreeBuilder(num_items=3)
409+
builder.add(min_x, min_y, max_x, max_y)
410+
tree = builder.finish()
411+
412+
# Copy to a Python bytes object
413+
copied_tree = bytes(tree)
414+
415+
# The entire RTree is contained within this 144 byte buffer
416+
assert len(copied_tree) == 144
417+
418+
# We can use the bytes object (or anything else implementing the Python buffer
419+
# protocol) directly in searches
420+
results = rt.neighbors(copied_tree, 5, 5)
421+
assert results.to_pylist() == [2, 1, 0]
422+
```
395423
"""
396424
def __repr__(self) -> str: ...
397425
@property
