Description
I have a question that has puzzled me for some time. From both the original implementation and the MMDetection3D version, it seems that the point-based branch does not process the raw point cloud at full resolution. In both cases, the input point cloud undergoes a quantization step before being fed into SPVConv, e.g.:
```python
pc_ = np.floor(raw_pc)
_, inds, inverse_map = sparse_quantize(pc_, return_index=True, return_inverse=True)
input_pc = pc_[inds]
```
After this step, SPVConv (a voxel-based convolution combined with a high-resolution point-based MLP) operates on `input_pc`. In practice, then, the "point-based" branch appears to operate on the quantized representation (`input_pc`) rather than on the raw point cloud (`raw_pc`). This seems slightly inconsistent with Section 3.3 of the paper "Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution", where SPVConv is described as operating on the raw points (the point cloud tensor T) before flooring and quantization.
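To make the concern concrete, here is a minimal NumPy-only sketch of the quantization step, with `sparse_quantize` replaced by a simplified stand-in (the real torchsparse helper hashes coordinates rather than calling `np.unique`, but the deduplication effect illustrated here is the same): when two raw points fall into the same unit voxel, only one representative survives, so the downstream branch no longer sees the full raw resolution.

```python
import numpy as np

def sparse_quantize(coords):
    """Simplified stand-in for torchsparse's sparse_quantize:
    returns the index of one representative point per occupied voxel,
    plus an inverse map from every raw point to its voxel."""
    _, inds, inverse_map = np.unique(coords, axis=0,
                                     return_index=True, return_inverse=True)
    return inds, inverse_map

# Hypothetical raw point cloud: the first two points share one unit voxel.
raw_pc = np.array([[0.2, 0.1, 0.9],
                   [0.8, 0.4, 0.3],   # same voxel as the point above
                   [1.5, 0.2, 0.7]])

pc_ = np.floor(raw_pc).astype(np.int32)   # quantize to the voxel grid
inds, inverse_map = sparse_quantize(pc_)
input_pc = pc_[inds]                      # deduplicated voxel coordinates

print(len(raw_pc))    # 3 raw points in
print(len(input_pc))  # 2 quantized points out: one voxel collapsed two points
```

Whether this resolution loss matters presumably depends on the voxel size relative to the sensor's point density, which is exactly the question raised above.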
I wonder whether the "high-resolution point cloud" mentioned in the paper refers to these quantized points (voxels) after sparse quantization, or whether I have misunderstood some part of the pipeline.