Add fps CUDA kernel #588

Merged
akihironitta merged 14 commits into master from port-cluster-04-fps-cuda on Mar 23, 2026

Conversation

@akihironitta
Member

No description provided.

Port grid voxelization from pytorch_cluster into pyg-lib.
Each point is assigned a 1D cluster index based on its quantized
voxel position: floor((pos - start) / size), then flattened
with cumulative voxel counts.
Port CUDA grid voxelization kernel. One thread per point computes
the flattened voxel index. Tests verify CPU/CUDA parity.
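To make the indexing in the two grid voxelization commits concrete, here is a minimal pure-Python sketch of the per-point computation. It is not the pyg-lib kernel: the name `grid_cluster`, the argument layout, and the reading of "cumulative voxel counts" as a cumulative product of per-dimension voxel counts (inclusive of `end`) are all assumptions made for illustration.

```python
import math


def grid_cluster(pos, size, start, end):
    """Assign each point a flattened voxel index (hypothetical reference sketch).

    pos   -- list of points, each a list of D coordinates
    size  -- voxel edge length per dimension
    start -- lower bound of the grid per dimension
    end   -- upper bound of the grid per dimension
    """
    dims = len(size)
    # Assumed: number of voxels per dimension, inclusive of the `end` boundary.
    counts = [
        int(math.floor((end[d] - start[d]) / size[d])) + 1 for d in range(dims)
    ]
    # Assumed: stride of dimension d is the product of the voxel counts of all
    # earlier dimensions (a shifted cumulative product).
    strides = [1] * dims
    for d in range(1, dims):
        strides[d] = strides[d - 1] * counts[d - 1]

    out = []
    for p in pos:  # in the CUDA version, one thread would handle one point
        index = 0
        for d in range(dims):
            coord = int(math.floor((p[d] - start[d]) / size[d]))
            index += coord * strides[d]
        out.append(index)
    return out


points = [[0.0, 0.0], [1.5, 0.2], [0.3, 2.7], [3.9, 3.9]]
print(grid_cluster(points, size=[1.0, 1.0], start=[0.0, 0.0], end=[4.0, 4.0]))
```

Because every point is processed independently, the loop body maps directly onto one thread per point, and a pure reference like this is the kind of baseline a CPU/CUDA parity test can compare against.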
Port greedy farthest point sampling from pytorch_cluster.
Uses at::parallel_for over batch dimension. Iteratively selects
the point farthest from the already-selected set.
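The greedy selection described in this commit can be sketched in pure Python for a single batch element. This is an illustrative reference, not the ported implementation: the function name, the `num_samples` and `start_index` parameters, and the use of squared Euclidean distance are assumptions. The actual port parallelizes over batches with `at::parallel_for`, which this sketch omits.

```python
def farthest_point_sampling(points, num_samples, start_index=0):
    """Greedy farthest point sampling over one point set (hypothetical sketch).

    Returns the indices of the selected points in selection order.
    """
    n = len(points)
    if n == 0 or num_samples <= 0:
        return []
    num_samples = min(num_samples, n)

    def sq_dist(a, b):
        # Squared Euclidean distance; the square root is not needed for argmax.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [start_index]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [sq_dist(p, points[start_index]) for p in points]

    while len(selected) < num_samples:
        # Pick the point farthest from the already-selected set.
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        # Only distances to the newly added point can shrink min_dist.
        for i in range(n):
            d = sq_dist(points[i], points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


pts = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [5.0, 0.0]]
print(farthest_point_sampling(pts, num_samples=3))
```

Keeping a running `min_dist` array makes each iteration linear in the number of points, so selecting k samples costs O(k * n) distance evaluations rather than recomputing distances to the whole selected set.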
Port CUDA farthest point sampling kernel with shared-memory argmax
reduction. Uses explicit scalar_gt/scalar_lt/scalar_min helpers to
avoid NVCC operator overload ambiguity with c10::SymInt.
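The shared-memory argmax reduction mentioned here follows a common CUDA pattern that can be emulated on the host. The sketch below is plain Python, not the kernel from this PR: `block_argmax` and `block_size` are hypothetical names, the lists stand in for shared memory, and the loop over `tid` stands in for threads that would run concurrently between `__syncthreads()` barriers. The `scalar_gt`/`scalar_lt`/`scalar_min` helpers are not modeled.

```python
def block_argmax(values, block_size=8):
    """Emulate a one-block argmax reduction (hypothetical sketch).

    Returns (index, value) of a maximum element, or (-1, -inf) if empty.
    """
    # The halving scheme below assumes a power-of-two block size.
    assert block_size > 0 and block_size & (block_size - 1) == 0

    # Stand-ins for shared memory: one slot per emulated thread.
    best_val = [float("-inf")] * block_size
    best_idx = [-1] * block_size

    # Phase 1: each thread scans a strided slice and keeps its local maximum.
    for tid in range(block_size):
        for i in range(tid, len(values), block_size):
            if values[i] > best_val[tid]:
                best_val[tid] = values[i]
                best_idx[tid] = i

    # Phase 2: tree reduction. At each step, the lower half of the active
    # threads merges the result held by its partner in the upper half.
    stride = block_size // 2
    while stride > 0:
        for tid in range(stride):  # a barrier would follow this step on a GPU
            if best_val[tid + stride] > best_val[tid]:
                best_val[tid] = best_val[tid + stride]
                best_idx[tid] = best_idx[tid + stride]
        stride //= 2

    return best_idx[0], best_val[0]


dists = [3.0, 9.5, 1.0, 7.0, 9.0, 2.0, 0.5, 4.0, 8.0, 6.0]
print(block_argmax(dists, block_size=4))
```

In farthest point sampling, a reduction of this shape would be applied to the per-point minimum distances on every iteration to find the next sample, which is why it needs to carry the index alongside the value.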
@akihironitta akihironitta force-pushed the port-cluster-03-fps-dispatch branch from b5dd5d5 to a9f95f8 Compare March 16, 2026 07:37
@akihironitta akihironitta force-pushed the port-cluster-04-fps-cuda branch from 0190d9e to 6ab0a1f Compare March 16, 2026 07:37
@github-actions github-actions bot removed the ci label Mar 16, 2026
Base automatically changed from port-cluster-03-fps-dispatch to master March 23, 2026 01:01
@akihironitta akihironitta merged commit a30c22d into master Mar 23, 2026
26 of 27 checks passed
@akihironitta akihironitta deleted the port-cluster-04-fps-cuda branch March 23, 2026 04:05