Merged
Implements D8 flow direction using ESRI power-of-2 encoding (compatible with GDAL/ArcGIS) across all four backends (numpy, cupy, dask+numpy, dask+cupy). Follows the slope.py pattern with explicit cellsize-parameterized backend wrappers.

New files:
- xrspatial/flow_direction.py: CPU kernel (@ngjit), GPU device/global kernels, four backend wrappers, public API with @supports_dataset
- xrspatial/tests/test_flow_direction.py: 66 tests (flat, cardinal, diagonal, known bowl, NaN handling, cross-backend, boundary modes, dtype acceptance, dataset support, cellsize effect, valid codes)
- benchmarks/benchmarks/flow_direction.py: ASV benchmark

Modified files:
- xrspatial/__init__.py: export flow_direction
- xrspatial/accessor.py: add Hydrology section to both accessors
- README.md: add Hydrology section with Flow Direction entry
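For readers unfamiliar with the encoding, the following is a minimal pure-NumPy sketch of D8 flow direction with ESRI power-of-2 codes (E=1, SE=2, S=4, SW=8, W=16, NW=32, N=64, NE=128). It is not the xrspatial kernel: the function name `d8_flow_direction`, the choice of 0 for flat cells and pits, and NaN on the grid edge are assumptions made for illustration only.

```python
import numpy as np

# ESRI D8 neighbours as (row offset, column offset, code).
D8_NEIGHBORS = [
    (0, 1, 1), (1, 1, 2), (1, 0, 4), (1, -1, 8),
    (0, -1, 16), (-1, -1, 32), (-1, 0, 64), (-1, 1, 128),
]


def d8_flow_direction(elev, cellsize_x=1.0, cellsize_y=1.0):
    """Hypothetical helper: steepest-descent D8 codes for interior cells.

    Assumptions (not taken from the PR): edge cells and cells with a NaN
    in their 3x3 window are NaN; flat cells and pits are coded 0.
    """
    elev = np.asarray(elev, dtype=np.float64)
    rows, cols = elev.shape
    out = np.full(elev.shape, np.nan)
    diag = np.hypot(cellsize_x, cellsize_y)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if np.isnan(elev[r - 1:r + 2, c - 1:c + 2]).any():
                continue
            best_slope, best_code = 0.0, 0
            for dr, dc, code in D8_NEIGHBORS:
                # Distance depends on whether the step is E/W, N/S, or diagonal.
                if dr == 0:
                    dist = cellsize_x
                elif dc == 0:
                    dist = cellsize_y
                else:
                    dist = diag
                slope = (elev[r, c] - elev[r + dr, c + dc]) / dist
                if slope > best_slope:
                    best_slope, best_code = slope, code
            out[r, c] = best_code
    return out
```

Because the drop is divided by the step distance, a diagonal neighbour must be lower than a cardinal one by a factor of about 1.41 (on square cells) to win, which is why the cell size has to be passed explicitly.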
Extract basin delineation from watershed.py into its own basin.py module with a basin() public function. The old basins() stays as a backward-compatible wrapper. Add basin to __init__.py and both xarray accessors.

Add BoundarySnapshot to _boundary_store.py, a read-only in-memory copy of converged boundary strips. BoundaryStore.snapshot() copies strip data to plain numpy arrays and closes the underlying memmap temp files. Apply this pattern across all hydrology dask backends (flow_accumulation, watershed, fill, stream_order) so temp directories are cleaned up before the lazy dask result is returned. Previously, BoundaryStore objects captured in map_blocks closures kept temp files alive indefinitely.
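The snapshot pattern described above can be sketched roughly as follows. The classes `StripStore` and `StripSnapshot` are hypothetical stand-ins, not the actual BoundaryStore or BoundarySnapshot code; the sketch only shows the idea of copying memmap-backed strips into plain arrays and then releasing the temp directory before anything lazy captures the data.

```python
import os
import tempfile

import numpy as np


class StripSnapshot:
    """Hypothetical read-only, in-memory copy of boundary strips."""

    def __init__(self, strips):
        self._strips = tuple(strips)
        for strip in self._strips:
            strip.setflags(write=False)  # guard against accidental mutation

    def get(self, i):
        return self._strips[i]


class StripStore:
    """Hypothetical memmap-backed store living in a temp directory."""

    def __init__(self, n_strips, strip_len):
        self._tmpdir = tempfile.TemporaryDirectory()
        path = os.path.join(self._tmpdir.name, "strips.dat")
        self._mm = np.memmap(path, dtype=np.float64, mode="w+",
                             shape=(n_strips, strip_len))

    def set(self, i, values):
        self._mm[i, :] = values

    def snapshot(self):
        # np.array(...) copies into a plain ndarray (not a memmap view).
        strips = [np.array(self._mm[i]) for i in range(self._mm.shape[0])]
        self.close()
        return StripSnapshot(strips)

    def close(self):
        if self._mm is not None:
            self._mm.flush()
            self._mm = None          # drop the mapping before deleting files
            self._tmpdir.cleanup()
```

A closure that captures only the snapshot holds ordinary arrays, so the temp files can be removed as soon as `snapshot()` returns instead of living as long as the lazy graph does.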
Assigns unique position-based IDs to each stream segment between junctions, headwaters, and outlets. Supports numpy, cupy, dask+numpy, and dask+cupy backends.
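As a rough illustration of segment labelling, the sketch below walks a D8 grid and gives every stream segment the ID of its starting cell. The ID formula (`row * ncols + col + 1`), the use of 0 for non-stream cells, and the function name `stream_link` as written here are assumptions for the example; the PR does not spell out the exact scheme.

```python
import numpy as np

# ESRI D8 code -> (row offset, column offset)
D8_OFFSETS = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
              16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}


def stream_link(flow_dir, stream_mask):
    """Hypothetical sketch: label each segment with its start cell's position."""
    stream_mask = np.asarray(stream_mask, dtype=bool)
    rows, cols = flow_dir.shape

    def downstream(r, c):
        # Next stream cell along the flow direction, or None at an outlet.
        off = D8_OFFSETS.get(flow_dir[r, c])
        if off is None:
            return None
        nr, nc = r + off[0], c + off[1]
        if 0 <= nr < rows and 0 <= nc < cols and stream_mask[nr, nc]:
            return nr, nc
        return None

    # Number of stream cells draining into each stream cell.
    inflow = np.zeros((rows, cols), dtype=np.int64)
    for r, c in zip(*np.nonzero(stream_mask)):
        nxt = downstream(r, c)
        if nxt is not None:
            inflow[nxt] += 1

    link = np.zeros((rows, cols), dtype=np.int64)  # 0 = not a stream cell
    # Segments start at headwaters (no inflow) and junctions (2+ inflows).
    for r, c in zip(*np.nonzero(stream_mask & (inflow != 1))):
        seg_id = r * cols + c + 1  # assumed position-based ID
        cur = (r, c)
        while cur is not None:
            link[cur] = seg_id
            nxt = downstream(*cur)
            if nxt is None or inflow[nxt] != 1:
                break  # reached an outlet or the next junction
            cur = nxt
    return link
```

Deriving the ID from the start cell's position means each chunk can label its segments without a global counter, which is one plausible reason to prefer position-based IDs on dask backends.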
Moves each pour point to the highest flow-accumulation cell within a circular search radius so that watershed delineation starts from the actual drainage channel. Dask backend extracts sparse pour points chunk-by-chunk (map_blocks flag pass + selective load) to keep memory bounded regardless of grid size.
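A single-point version of the snapping step might look like the sketch below. It is not the xrspatial implementation: the function name, the radius expressed in cells rather than map units, and the first-match tie-breaking are all assumptions chosen to keep the example short.

```python
import numpy as np


def snap_pour_point(flow_accum, row, col, radius):
    """Hypothetical helper: move (row, col) to the highest-accumulation
    cell within `radius` cells (Euclidean distance). Ties go to the first
    cell in row-major order."""
    rows, cols = flow_accum.shape
    # Clip the bounding box of the search circle to the grid.
    r0, r1 = max(row - radius, 0), min(row + radius + 1, rows)
    c0, c1 = max(col - radius, 0), min(col + radius + 1, cols)
    window = flow_accum[r0:r1, c0:c1].astype(np.float64)

    # Circular mask: keep only cells whose centre lies inside the radius.
    rr, cc = np.ogrid[r0:r1, c0:c1]
    inside = (rr - row) ** 2 + (cc - col) ** 2 <= radius ** 2

    candidates = np.where(inside & ~np.isnan(window), window, -np.inf)
    if not np.isfinite(candidates.max()):
        return row, col  # nothing valid nearby: leave the point in place
    i, j = np.unravel_index(np.argmax(candidates), candidates.shape)
    return r0 + int(i), c0 + int(j)
```

The circular mask matters at the corners of the bounding box: a very high accumulation cell that sits diagonally just outside the radius is ignored, whereas a square window would have picked it up.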
…ptive cache

Replace three memory/performance bottlenecks in _flow_path_dask:
- Path tracing (phase 3): growable numpy buffers (~24 bytes/cell) instead of list-of-tuples (~164 bytes/cell)
- Output assembly (phase 4): pre-group cells by chunk via vectorized searchsorted + stable argsort, then O(1) dict lookup per block instead of O(n_chunks * n_cells) linear scan
- LRU cache: adaptive sizing capped at ~512 MB instead of fixed 32 entries regardless of chunk size
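The pre-grouping idea in the output-assembly item can be sketched as below. This is an illustration of the searchsorted + stable argsort technique only; the function name `group_cells_by_chunk` and its arguments are hypothetical and do not come from _flow_path_dask.

```python
import numpy as np


def group_cells_by_chunk(rows, cols, row_chunks, col_chunks):
    """Hypothetical helper: map (block_row, block_col) -> indices of the
    cells that fall in that chunk, preserving the cells' original order."""
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    n_col_blocks = len(col_chunks)

    # Cumulative chunk ends; side="right" puts a cell sitting exactly on a
    # boundary offset into the following chunk.
    block_r = np.searchsorted(np.cumsum(row_chunks), rows, side="right")
    block_c = np.searchsorted(np.cumsum(col_chunks), cols, side="right")
    flat = block_r * n_col_blocks + block_c

    # A stable sort keeps cells within one chunk in their input order.
    order = np.argsort(flat, kind="stable")
    keys, starts = np.unique(flat[order], return_index=True)
    ends = np.append(starts[1:], len(order))

    groups = {}
    for key, start, end in zip(keys, starts, ends):
        block = (int(key) // n_col_blocks, int(key) % n_col_blocks)
        groups[block] = order[start:end]
    return groups
```

With the dictionary built once, each block's assembly step does a single lookup such as `groups.get((i, j))` rather than scanning every traced cell for every chunk.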
Summary
slope.py pattern

Test plan
- pytest xrspatial/tests/test_flow_direction.py -v: all 66 tests pass (including GPU)
- pytest xrspatial/tests/test_slope.py xrspatial/tests/test_terrain_metrics.py -v: all 220 existing tests still pass
- python -c "from xrspatial import flow_direction": import works
- agg.xrs.flow_direction(): accessor works on both DataArray and Dataset