Update Gaussian splatting camera API and world-space parity #518

Open
fwilliams wants to merge 24 commits into openvdb:main from fwilliams:fw/gaussian-camera-api

Conversation

@fwilliams
Collaborator

@fwilliams fwilliams commented Mar 7, 2026

Summary

This PR updates Gaussian splatting camera APIs to separate camera semantics from projection implementation choice, adds world-space render parity for depth and RGBD paths, and finishes the follow-up validation/test cleanup needed to make the new API consistent across C++, Python bindings, Python wrappers, and typing stubs.

What changed

  • replaced the old public ProjectionType / DistortionModel split with explicit CameraModel and ProjectionMethod controls
  • updated projected-state metadata so ProjectedGaussianSplats carries camera_model and projection_method
  • added render_depths_from_world(...) and render_images_and_depths_from_world(...) to match the existing world-space image render path
  • fixed dense and sparse antialias forwarding so antialias is honored consistently in all camera render entrypoints
  • tightened shared camera/projection validation and centralized it so analytic and unscented paths enforce the same contracts
  • removed an unnecessary heavy include from GaussianRenderSettings.h
  • fixed a stale script callsite that still used the old positional argument ordering
  • refreshed Python docs and tests to match the final public API contract

API design

The public Gaussian camera API now uses two independent axes:

  • CameraModel: what camera is being modeled
  • ProjectionMethod: how projection is evaluated

ProjectionMethod.AUTO remains the default dispatch behavior:

  • analytic projection for PINHOLE and ORTHOGRAPHIC
  • unscented projection for OpenCV-style distorted cameras
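
The AUTO dispatch rule above can be sketched as a small resolver. This is an illustrative stand-in, not the fVDB implementation: the enums are redefined locally, `OPENCV_EXAMPLE` is a hypothetical placeholder for the OPENCV_* family, the member names `ANALYTIC` and `UNSCENTED` are assumed, and `resolve_projection_method` is a made-up helper name.

```python
from enum import Enum, auto


class CameraModel(Enum):
    # Local stand-in; OPENCV_EXAMPLE is a hypothetical placeholder for OPENCV_* models.
    PINHOLE = auto()
    ORTHOGRAPHIC = auto()
    OPENCV_EXAMPLE = auto()


class ProjectionMethod(Enum):
    # Local stand-in; the ANALYTIC and UNSCENTED member names are assumed.
    AUTO = auto()
    ANALYTIC = auto()
    UNSCENTED = auto()


def resolve_projection_method(camera_model, projection_method=ProjectionMethod.AUTO):
    """Resolve AUTO to a concrete method; explicit choices pass through unchanged."""
    if projection_method is not ProjectionMethod.AUTO:
        # The real validation may reject some explicit combinations; not modelled here.
        return projection_method
    if camera_model in (CameraModel.PINHOLE, CameraModel.ORTHOGRAPHIC):
        return ProjectionMethod.ANALYTIC
    return ProjectionMethod.UNSCENTED
```

Keeping the two enums separate means a caller can change the evaluation strategy without touching the description of the camera.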

Distortion coefficients contract

This PR defines the intended contract for distortion_coeffs:

  • CameraModel.OPENCV_*: a contiguous tensor with shape (C, 12) is required
  • CameraModel.PINHOLE / CameraModel.ORTHOGRAPHIC: callers may pass None or a (C, 12) tensor; if a tensor is provided it is ignored
  • a zero-filled (C, 12) tensor represents "no distortion" for OpenCV camera models
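
A minimal sketch of how this contract could be checked, assuming a hypothetical helper `check_distortion_coeffs` that receives the tensor's shape (or None) rather than a real tensor. The contiguity requirement is not modelled, and rejecting a wrongly shaped tensor for PINHOLE / ORTHOGRAPHIC is an assumption of this sketch, not something the PR states.

```python
from typing import Optional, Tuple


def check_distortion_coeffs(
    is_opencv_model: bool,
    num_cameras: int,
    coeffs_shape: Optional[Tuple[int, ...]],
) -> bool:
    """Return True if the coefficients will be used, False if they are ignored.

    Raises ValueError when the contract is violated. `coeffs_shape` stands in
    for `tensor.shape`; None means no tensor was passed.
    """
    if coeffs_shape is not None and tuple(coeffs_shape) != (num_cameras, 12):
        # Assumption: a tensor with the wrong shape is rejected for every model.
        raise ValueError(f"expected shape ({num_cameras}, 12), got {tuple(coeffs_shape)}")
    if is_opencv_model:
        if coeffs_shape is None:
            raise ValueError("OpenCV camera models require a (C, 12) distortion_coeffs tensor")
        return True
    # PINHOLE / ORTHOGRAPHIC: None or a (C, 12) tensor is accepted and ignored.
    return False
```

Under this reading, "no distortion" for an OpenCV model is expressed by passing zeros, never by passing None.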

Testing

Validated with:

  • conda run -n fvdb black --target-version=py311 --line-length=120 --extend-exclude='wip/' --check fvdb/gaussian_splatting.py tests/unit/test_gaussian_splat_3d.py src/tests/scripts/write_rasterize_forward_test_data.py
  • conda run -n fvdb ./build.sh install gtests
  • ctest --output-on-failure -R GaussianCamerasTest
  • ctest --output-on-failure -R GaussianSplat3dCameraApiTest
  • cd tests && conda run -n fvdb pytest unit/test_gaussian_splat_3d.py -k 'camera_api_validation_errors or pinhole_and_orthographic_ignore_distortion_coeffs_tensor or projection_method_resolution_and_metadata' -v
  • earlier branch validation also included cd tests && conda run -n fvdb pytest unit/test_gaussian_splat_3d.py -v

Contributor

Copilot AI left a comment

Pull request overview

This PR updates FVDB’s Gaussian splatting camera API to cleanly separate camera semantics (CameraModel) from projection implementation choice (ProjectionMethod), and extends world-space rendering to include depth-only and RGBD variants to match the existing world-space RGB path.

Changes:

  • Replaced the old projection/distortion API split with camera_model, projection_method, and distortion_coeffs across C++, pybind, Python wrappers, and typing stubs.
  • Added world-space parity APIs: render_depths_from_world(...) and render_images_and_depths_from_world(...).
  • Expanded/updated CUDA/C++ and Python test coverage for camera-model dispatch, metadata, and world-space depth/RGBD parity.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/unit/test_rasterize_from_world.py | Updates test helper + adds world-space depth/RGBD parity and gradient tests across camera models. |
| tests/unit/test_gaussian_splat_3d.py | Adds new camera API test suite covering dispatch/validation/parity and metadata fields. |
| src/tests/GaussianSplat3dCameraApiTest.cpp | New C++ gtest coverage for camera API dispatch, validation, and world-space parity. |
| src/tests/GaussianCamerasTest.cu | Adds distorted-perspective projection parity test and helpers for OpenCV distortion coefficients. |
| src/tests/CMakeLists.txt | Registers the new GaussianSplat3dCameraApiTest. |
| src/python/GaussianSplatBinding.cpp | Exposes CameraModel + ProjectionMethod and updates GaussianSplat3d bindings + new world-space depth/RGBD entry points. |
| src/fvdb/detail/ops/gsplat/GaussianCameras.cuh | Introduces ProjectionMethod enum. |
| src/fvdb/GaussianSplat3d.h | Updates public C++ API signatures + projected-state metadata, and adds world-space depth/RGBD entry points. |
| src/fvdb/GaussianSplat3d.cpp | Implements shared camera arg validation + dispatch and adds world-space depth/RGBD implementations. |
| fvdb/gaussian_splatting.py | Updates Python wrapper APIs, conversions, metadata accessors, and adds world-space depth/RGBD wrapper methods. |
| fvdb/enums.py | Renames DistortionModel → CameraModel and adds ProjectionMethod. |
| fvdb/_fvdb_cpp.pyi | Updates stubs for new camera/projection API surface and adds world-space depth/RGBD bindings. |
| fvdb/__init__.pyi | Re-exports CameraModel and ProjectionMethod in typing. |
| fvdb/__init__.py | Re-exports CameraModel and ProjectionMethod at runtime. |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

src/tests/scripts/write_rasterize_forward_test_data.py:72

  • project_gaussians_for_images(...) call is now passing CameraModel.PINHOLE and sh_degree as positional arguments, but the Python API signature has inserted projection_method and distortion_coeffs between them. As written, sh_degree will be interpreted as projection_method, which will raise at runtime or select an unintended method. Pass projection_method/distortion_coeffs explicitly (or switch to keyword args for camera_model and sh_degree_to_use) to keep this script usable for regenerating test data.
projected_gaussians = gs3d.project_gaussians_for_images(
    cam_to_world_mats,
    projection_mats,
    width,
    height,
    near_plane,
    far_plane,
    CameraModel.PINHOLE,
    sh_degree,
    min_radius_2d=0.0,
    eps_2d=1e-4,
    antialias=True,
)
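
The positional shift described in this comment can be reproduced with a toy function. `project` below is a hypothetical stand-in with a shortened parameter list, not the real `project_gaussians_for_images` signature; only the relative order of `camera_model`, `projection_method`, `distortion_coeffs`, and `sh_degree_to_use` follows the comment.

```python
def project(cam, near, far, camera_model, projection_method="AUTO",
            distortion_coeffs=None, sh_degree_to_use=-1):
    """Hypothetical stand-in: new parameters were inserted right after camera_model."""
    return {"projection_method": projection_method, "sh_degree_to_use": sh_degree_to_use}


# Old-style positional call: the fifth argument now lands in projection_method.
shifted = project("cam", 0.1, 100.0, "PINHOLE", 3)

# Keyword call: each value reaches the parameter it was meant for.
fixed = project("cam", 0.1, 100.0, camera_model="PINHOLE", sh_degree_to_use=3)
```

In the toy version the misplaced value is silently accepted, which is why keyword arguments are the safer choice for a script that only runs occasionally.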

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/tests/scripts/write_rasterize_forward_test_data.py:72

  • GaussianSplat3d.project_gaussians_for_images(...) signature now includes projection_method and distortion_coeffs immediately after camera_model. This call uses positional arguments beyond far_plane, so sh_degree will be interpreted as projection_method, shifting all subsequent parameters and likely breaking the generated test data. Switch to keyword arguments for camera_model, projection_method (e.g. AUTO), distortion_coeffs, and sh_degree_to_use (or pass the missing positional placeholders explicitly) to avoid accidental mis-ordering when the API evolves.
projected_gaussians = gs3d.project_gaussians_for_images(
    cam_to_world_mats,
    projection_mats,
    width,
    height,
    near_plane,
    far_plane,
    CameraModel.PINHOLE,
    sh_degree,
    min_radius_2d=0.0,
    eps_2d=1e-4,
    antialias=True,
)

Replace the public Gaussian camera API with CameraModel and ProjectionMethod so analytic and unscented projection paths share one coherent surface. Add the missing from-world depth/RGBD bindings and wrappers while keeping projection_matrices stable, then update typings and tests to cover the new behavior.

Signed-off-by: Francis Williams <francis@fwilliams.info>
Made-with: Cursor
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

fvdb/gaussian_splatting.py:2114

  • The render_images_from_world docstring says to use None for distortion_coeffs “for no distortion”, but the new camera validation requires distortion_coeffs to be provided for OpenCV camera models (it can be a zero-filled (C, 12) tensor). Update the docstring to reflect the actual contract: distortion_coeffs must be None for PINHOLE/ORTHOGRAPHIC and a (C,12) tensor for CameraModel.OPENCV_*.
                :attr:`fvdb.ProjectionMethod.AUTO`.
            distortion_coeffs (torch.Tensor | None): Distortion coefficients for OpenCV camera
                models. Use ``None`` for no distortion. Expected shape is ``(C, 12)`` with packed
                layout ``[k1,k2,k3,k4,k5,k6,p1,p2,s1,s2,s3,s4]``. For camera models that use fewer
                coefficients, unused entries should be set to 0.
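
As an illustration of the packed layout quoted in this docstring, the sketch below fills one 12-entry row from named coefficients and zero-fills the rest. `pack_distortion_coeffs` is a hypothetical helper, the numeric values are made up, and a real call would stack C such rows into a `(C, 12)` tensor.

```python
# Packed order taken from the docstring: [k1,k2,k3,k4,k5,k6,p1,p2,s1,s2,s3,s4].
LAYOUT = ("k1", "k2", "k3", "k4", "k5", "k6", "p1", "p2", "s1", "s2", "s3", "s4")


def pack_distortion_coeffs(**named):
    """Pack named coefficients into one 12-entry row; unspecified entries are 0.0."""
    unknown = set(named) - set(LAYOUT)
    if unknown:
        raise ValueError(f"unknown coefficient names: {sorted(unknown)}")
    return [float(named.get(name, 0.0)) for name in LAYOUT]


# A camera that only uses k1, k2, k3, p1, p2; the values are invented for the example.
row = pack_distortion_coeffs(k1=-0.1, k2=0.01, p1=0.001, p2=-0.002, k3=0.0005)

# All zeros: the "no distortion" row for an OpenCV camera model.
no_distortion = pack_distortion_coeffs()
```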

Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.


Signed-off-by: Francis Williams <francis@fwilliams.info>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated no new comments.


Signed-off-by: Francis Williams <francis@fwilliams.info>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.


Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.


Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Francis Williams <fwilliams@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

fvdb/_fvdb_cpp.pyi:269

  • The GaussianSplat3d.render_from_projected_gaussians signature in this stub file is missing the masks argument, but the pybind binding exposes masks (see src/python/GaussianSplatBinding.cpp where py::arg("masks") is defined). Please add masks: Optional[torch.Tensor] = ... to keep type hints consistent with the runtime API.
    def render_from_projected_gaussians(
        self,
        projected_gaussians: ProjectedGaussianSplats,
        crop_width: int = ...,
        crop_height: int = ...,
        crop_origin_w: int = ...,
        crop_origin_h: int = ...,
        tile_size: int = ...,
        backgrounds: Optional[torch.Tensor] = ...,
    ) -> tuple[torch.Tensor, torch.Tensor]: ...

Apply clang-format include ordering in GaussianCamerasTest so the PR passes the C++ style check after merging from main.

Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated no new comments.


@harrism
Contributor

harrism commented Mar 18, 2026

> CameraModel.PINHOLE / CameraModel.ORTHOGRAPHIC: callers may pass None or a (C, 12) tensor; if a tensor is provided it is ignored

Is there a reason not to require None for these two models? I think it's a cleaner API if there is only one option.

@fwilliams
Collaborator Author

> CameraModel.PINHOLE / CameraModel.ORTHOGRAPHIC: callers may pass None or a (C, 12) tensor; if a tensor is provided it is ignored
>
> Is there a reason not to require None for these two models? I think it's a cleaner API if there is only one option.

Good question.

My line of thinking was that a user might have some code that automatically switches between different camera models. In this case, the distortion coefficients are likely just part of the dataset, but the camera model is a user choice.

If you enforce None for the distortion coefficients, you need to write:

def my_render_pipeline(camera_model: CameraModel, image_index: int, dataset: SfmDataset, splats: GaussianSplat3d):
    distortion_coeffs = None
    if camera_model not in (CameraModel.ORTHOGRAPHIC, CameraModel.PINHOLE):
        distortion_coeffs = dataset[image_index]["distortion_coeffs"]
    splats.render_images(..., camera_model, distortion_coeffs, ...)

If the constraint is not enforced, this code becomes:

def my_render_pipeline(camera_model: CameraModel, image_index: int, dataset: SfmDataset, splats: GaussianSplat3d):
    distortion_coeffs = dataset[image_index]["distortion_coeffs"]
    splats.render_images(..., camera_model, distortion_coeffs, ...)

Allowing None means if the dataset has no distortion, then you'll get an error when you render with distorted camera models.

Contributor

@harrism harrism left a comment

Overall looks great. Tests are comprehensive. A couple of suggestions. I'm pre-approving but please consider my suggestions and question.

Contributor

@swahtz swahtz left a comment

Looks good, just a couple of small suggestions and catches.

While we don't need it right now, it gives me pause that the pinhole model assumes perspective projection. In the future we might want to express other projections and sensor models, such as cylindrical or spherical projections, and at that point we might want to move more features and argument checking into a larger Camera struct.

Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
Signed-off-by: Francis Williams <francis@fwilliams.info>
@swahtz swahtz added this to the v0.5 milestone Mar 26, 2026
4 participants