[WIP] feat(c/sedona-libgpuspatial) Add GPU-accelerated spatial joins #310

pwrliang · 2025-11-15T00:14:44Z

This PR adds GPU-accelerated spatial join support to SedonaDB using NVIDIA CUDA and the libgpuspatial library. GPU execution is automatically enabled when available and provides significant performance improvements for large-scale spatial joins.

Core Features

GPU Spatial Join Execution: Implemented GpuSpatialJoinExec physical plan that leverages CUDA for parallel spatial join operations
Auto-detection: GPU is automatically detected and enabled when building with --features gpu
Optimizer Integration: Spatial join optimizer automatically routes queries to GPU when enabled and hardware is available
CPU Fallback: Gracefully falls back to CPU execution when GPU is unavailable or encounters errors
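
For readers who want to picture the CPU fallback behaviour described above, here is a minimal sketch in Python. It is not the code in this PR (the real implementation lives in the Rust GpuSpatialJoinExec plan); `join_with_fallback`, `GpuUnavailable`, and the two join callables are hypothetical names used only for illustration.

```python
from typing import Callable, List, Tuple

Pairs = List[Tuple[int, int]]  # (left_row, right_row) index pairs


class GpuUnavailable(RuntimeError):
    """Hypothetical error raised when no usable GPU is present."""


def join_with_fallback(
    gpu_join: Callable[[], Pairs],
    cpu_join: Callable[[], Pairs],
    fallback_to_cpu: bool = True,
) -> Tuple[str, Pairs]:
    """Try the GPU path first; on failure, optionally rerun on the CPU.

    Returns the name of the backend that produced the result and the pairs.
    """
    try:
        return "gpu", gpu_join()
    except Exception:
        if not fallback_to_cpu:
            raise  # caller asked for GPU-only execution
        return "cpu", cpu_join()


def failing_gpu_join() -> Pairs:
    # Stand-in for a GPU join that cannot run on this machine.
    raise GpuUnavailable("no CUDA device found")


def cpu_join() -> Pairs:
    # Stand-in for the existing CPU spatial join.
    return [(0, 0), (1, 2)]


backend, pairs = join_with_fallback(failing_gpu_join, cpu_join)
print(backend, pairs)
```

The point of the sketch is only that the caller gets the same result shape from either backend, so a GPU failure does not surface as a query failure unless fallback is turned off.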

Testing

Added SQL integration test test_gpu_spatial_join_sql with guaranteed-intersecting geometries
Test validates both ST_Intersects and ST_Contains predicates via SQL EXPLAIN and execution
Fixed optimizer schema validation to work correctly with GPU execution plans
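
As an illustration of what "guaranteed-intersecting geometries" can look like, the sketch below builds polygon/point pairs as WKT where each point is the centre of its own square. This is an assumption about one reasonable way to construct such data, not a copy of the fixtures in test_gpu_spatial_join_sql; `square_wkt` and `intersecting_pairs` are hypothetical helpers.

```python
from typing import List, Tuple


def square_wkt(cx: float, cy: float, half: float) -> str:
    """Axis-aligned square centred on (cx, cy), as a closed WKT polygon ring."""
    x0, y0, x1, y1 = cx - half, cy - half, cx + half, cy + half
    return f"POLYGON(({x0} {y0}, {x1} {y0}, {x1} {y1}, {x0} {y1}, {x0} {y0}))"


def intersecting_pairs(n: int, spacing: float = 10.0) -> List[Tuple[str, str]]:
    """Return n (polygon, point) WKT pairs; each point is its square's centre.

    With half-width 1.0 and spacing 10.0 the squares never overlap, so each
    point lies strictly inside exactly one polygon.
    """
    pairs = []
    for i in range(n):
        cx = i * spacing
        pairs.append((square_wkt(cx, 0.0, 1.0), f"POINT({cx} 0.0)"))
    return pairs


for polygon, point in intersecting_pairs(2):
    print(polygon, "|", point)
```

Because every point is strictly interior to exactly one square, a join over n such pairs should return n rows for both ST_Intersects and ST_Contains, which makes the expected count easy to assert.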

Configuration

GPU support is enabled by building with --features gpu (it auto-enables when hardware is detected). It can also be toggled or tuned per session:

  # Disable GPU for entire session
  ctx.sql("SET sedona.spatial_join.gpu.enable = false")

  # Enable GPU for entire session
  ctx.sql("SET sedona.spatial_join.gpu.enable = true")

  # Check current setting
  result = ctx.sql("SHOW sedona.spatial_join.gpu.enable")
  result.show()

  # Set other GPU options
  ctx.sql("SET sedona.spatial_join.gpu.min_rows_threshold = 100000")
  ctx.sql("SET sedona.spatial_join.gpu.device_id = 0")
  ctx.sql("SET sedona.spatial_join.gpu.fallback_to_cpu = true")
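
A rough sketch of how these options might combine into a routing decision is shown below. The exact semantics are an assumption on my part: I am guessing that min_rows_threshold keeps small joins on the CPU and that device_id must name an existing device. `GpuJoinOptions` and `should_use_gpu` are hypothetical and do not exist in the codebase.

```python
from dataclasses import dataclass


@dataclass
class GpuJoinOptions:
    # Field names mirror the SET keys above; defaults are illustrative only.
    enable: bool = True
    min_rows_threshold: int = 100_000
    device_id: int = 0
    fallback_to_cpu: bool = True


def should_use_gpu(opts: GpuJoinOptions, device_count: int, estimated_rows: int) -> bool:
    """Decide whether a join is routed to the GPU (assumed semantics)."""
    if not opts.enable:
        return False  # GPU switched off for the session
    if not (0 <= opts.device_id < device_count):
        return False  # requested device is not present
    # Assumption: joins below the threshold are not worth the transfer cost.
    return estimated_rows >= opts.min_rows_threshold


opts = GpuJoinOptions()
print(should_use_gpu(opts, device_count=1, estimated_rows=250_000))
print(should_use_gpu(opts, device_count=1, estimated_rows=5_000))
```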

Testing

# Run GPU spatial join tests (requires CUDA-capable GPU)
cargo test --package sedona-spatial-join --features gpu test_gpu_spatial_join_sql -- --nocapture --ignored

# Build CLI with GPU support
cargo build --bin sedona-cli --features gpu --release

# Verify GPU execution via EXPLAIN
./target/release/sedona-cli -c "EXPLAIN SELECT * FROM polygons JOIN points ON ST_Intersects(polygons.geom, points.geom)"
# Should show: GpuSpatialJoinExec
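
The EXPLAIN check above can also be scripted. The helper below is a small sketch that looks for the operator name in captured plan text; `plan_uses_gpu` is a hypothetical name, and the sample plan strings are made up for illustration (the real EXPLAIN layout may differ).

```python
def plan_uses_gpu(explain_output: str, operator: str = "GpuSpatialJoinExec") -> bool:
    """Return True if any line of the plan text mentions the GPU join operator."""
    return any(operator in line for line in explain_output.splitlines())


# Made-up plan text; only the operator name comes from this PR.
gpu_plan = "physical_plan\n  GpuSpatialJoinExec: on=ST_Intersects(geom, geom)\n"
cpu_plan = "physical_plan\n  SomeCpuJoinOperator: on=ST_Intersects(geom, geom)\n"

print(plan_uses_gpu(gpu_plan))
print(plan_uses_gpu(cpu_plan))
```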

jiayuasu · 2025-11-17T01:33:47Z

@pwrliang now the CI is working

paleolimbot

This is amazing! It has been very cool to watch this project evolve over the last six months and I know this represents a huge amount of work.

This is a large change and I wanted to leave a few high-level things to think about while you're polishing this up.

I see a bit of commented-out code in some of the files...feel free to file GitHub issues if that code represents a future piece of work that needs doing (and remove the commented-out code)!
I see CUDA-specific tests, which are great! If there are portions of the code that aren't well-covered by tests we do need to add them (we can open follow-on issues and do this in follow-up PRs too)
Because we're an Apache project we need the licensing and provenance of the files to be clear. I see some copyright notices from Nvidia...there's a place in LICENSE.md to acknowledge subdirectories where code was copied. We also need license headers on all the files (there's a script in scripts/ that can help do mass addition of the license header to a bunch of files at once).
It looks like you've done a great job ensuring the casual contributor doesn't have to deal with the GPU build complexity using default-members. That was one of my initial concerns but it looks great so far.

Give me a ping when you're ready for me to take a look!

pwrliang · 2025-11-20T02:19:32Z

@paleolimbot Hi Dewey, thanks for your attention. This PR has now resolved the license issues and passes almost all of the CI jobs. I'd welcome any suggestions you have. @zhangfengcdt wrote the Rust part that hooks sedona-db up to libgpuspatial, so the credit for that work goes to him.

zhangfengcdt · 2025-11-20T14:52:30Z

@pwrliang Thanks for opening this PR! As per our discussion, we could break this large PR into smaller ones to make it manageable and reviewable. Ideally: (1) libgpuspatial (C++ and CUDA) with tests, (2) the GPU spatial join module in Rust, and (3) build pipelines and e2e tests.

Let me know if you can reduce this PR to include only (1), with the proper cleanup and Apache license headers. We can continue with (2) and (3) once this one merges. Thanks!

zhangfengcdt added 30 commits September 30, 2025 09:13

feat: Add GPU-accelerated spatial joins based on libgpuspatial C++/CUDA

39bcb05

Merge branch 'main' of github.com:zhangfengcdt/sedona-db into feature/gpu-spatial-join

02a7623

fix fmt error

b4cddec

Merge branch 'main' of github.com:zhangfengcdt/sedona-db into feature/gpu-spatial-join

b280b83

Add gpu config, execution structure, and feature flag

46de0ca

restructure

671cfba

implement stream

185ad5e

refactor to use simplied approach

d427683

implement tests

6f8d3c0

add tests and ci pipeline

966a896

disable running gpu tests

191bd85

add libgpuspatial source

a2ef437

test rust build

fbf2238

temporarily disable other workflows

f8234e6

wip

6afd6ef

fix rust-gpu build ci pipeline

205bbd5

fix ci build

50bf5aa

fix cmake

50b4885

fix ubuntu version

167571e

install from vcpkg.json

e464812

fix vcpkg not found issue

94bb0f0

add vkpkg.json

91beb9b

install gcc 10

5f5a8bd

use gcc for cuda compatibility

91457d9

gcc 10

869318e

fix build.rs

58af42e

force use gcc 10

4b219f5

use cuda 12.4 with gcc 11

2694d1d

add cuda repository before install cuda

7928564

cleanup disk space

939b5f6

pwrliang added 5 commits November 14, 2025 17:07

Log WKT parsing time

ce75dcf

Use pragma once and add licenses

0376b87

Fix include order

a376e87

Bugfix

8548017

Remove benchmark program

1314494

pwrliang changed the title ~~feat(c/sedona-libgpuspatial) Add GPU-accelerated spatial joins~~ [WIP] feat(c/sedona-libgpuspatial) Add GPU-accelerated spatial joins Nov 15, 2025

pwrliang added 4 commits November 16, 2025 10:59

Change log-level

71226e4

Merge upstream code

7a0e1ae

Fix tests

bb3e784

Restore yml changes

49d618e

pwrliang added 9 commits November 17, 2025 17:35

Calculate chunks according to free memory.

843d981

Debugging CI

2343e43

Merge branch 'main' of https://github.com/apache/sedona-db into gpu

8f9d4f8

Merge branch 'gpu' of github.com:pwrliang/sedona-db into gpu

a27de08

Add licenses

bd44d4b

Fix some license issues

a12379f

Fixes with pre-commit

5d820ef

Fix some lints

25320a0

Fix some lints

3d0ea67

paleolimbot reviewed Nov 18, 2025

pwrliang and others added 5 commits November 18, 2025 11:48

Remove commented out code

9f5d685

Remove commented out code

392b22b

Rewrite some code

58fcba3

Fix license issues

f147aba

Try to fix issues reported by clippy

42ebc68

pwrliang added 2 commits November 20, 2025 14:58

Merge branch 'main' of github.com:zhangfengcdt/sedona-db into gpu

4a67e45

Keep libgpuspatial only

b928ccf
