
Conversation

@arijit-dasgupta
Contributor

(Completes MET-29)

My apologies for this being a relatively big PR (not as bite-sized as I would have liked). It is my first attempt at implementing one of these components, and it took me several hours to understand how everything could be pieced together. It includes:

  1. Rearranged the bindings into a separate folder, genmetaballs/src/cuda/bindings, where each file holds the bindings for a specific submodule and main.cu is the main binding file (previously everything lived in genmetaballs/src/cuda/bindings.cu).
  2. Implementation of the different Confidence structs in genmetaballs/src/cuda/core/confidence.cuh (0, 3, and 5 parameters), along with a templated CUDA kernel launch that computes this confidence score for an array of sumexpd_vec, implemented as gpu_get_confidence. I don't expect that we will need something like gpu_get_confidence in the forward render; this was just to test that it runs in CUDA.
  3. Implemented a confidence submodule binding in genmetaballs/src/cuda/bindings/confidence.cu which has:
    3.1. An init_confidence_submodule binding, using the NB_MODULE macro, that registers the different Confidence structs and also registers gpu_get_confidence.
    3.2. A type dispatcher called dispatch_confidence which performs the type dispatch for C++ calls (as in gpu_get_confidence).
  4. Testing for the confidence implementation in tests/python_tests/test_confidence.py which also checks that the confidence values are always between 0 and 1.
  5. Added a core submodule to the Python genmetaballs package which simply forwards to the submodule bindings.
  6. Sigmoid C++ code in genmetaballs/src/cuda/core/math_utils.cuh, a new file for math utils.

@linear

linear bot commented Nov 18, 2025

@arijit-dasgupta arijit-dasgupta marked this pull request as ready for review November 18, 2025 06:24
Contributor

@horizon-blue horizon-blue left a comment


Thank you for taking the lead here! :D The math looks good to me, and I left some inline comments on some minor issues. I also feel like we might want to move more things to the .cu files (not the ones in the bindings/ directory), and keep the headers themselves minimal. Though it's been a while since the last time I worked on a large cpp project, so I'd let @mugamma make the final call. We can also chat a bit about this tmr. Thanks again for the great (and speedy) work!

using ZeroParameterConfidence = ThreeParameterConfidence;

// Generic CUDA kernel for computing confidence values
template <typename Confidence>
Contributor

@mugamma might know more about this, but my instinct is that we probably shouldn't have kernel definitions like these in the header files. Actually, whenever possible, it'd be a good idea to move the implementation of the structs to the corresponding .cu file as well. We can chat about file organization together in our meeting tmr as well :)

Contributor Author

Sounds good to me, I know very little about the right way to organize here. Will implement whatever suggestions you both have :)

Contributor Author

I moved the kernel over to a .cu file in core (from the header) in 167be1e

For now I am keeping the full struct definitions in the .cuh file; I'm concerned that nanobind can't pick them up without the full definition in the header. But happy to take a suggestion here!
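If we later want the struct implementations out of the header too, one common pattern is explicit template instantiation in the .cu file. A rough sketch using the types discussed in this PR (names approximate, not the actual code):

```cuda
// confidence.cuh — full struct definitions stay here so nanobind can see them.
struct ZeroParameterConfidence {
    CUDA_CALLABLE float get_confidence(float sumexpd) const {
        return 1.0f - expf(-sumexpd);
    }
};

// confidence.cu — the kernel definition lives here, out of the header...
template <typename Confidence>
__global__ void confidence_kernel(const Confidence conf, const float* sumexpd,
                                  float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = conf.get_confidence(sumexpd[i]);
}

// ...with one explicit instantiation per Confidence type, so other translation
// units can link against the kernel without seeing its body.
template __global__ void confidence_kernel<ZeroParameterConfidence>(
    const ZeroParameterConfidence, const float*, float*, int);
```

The trade-off is that every new Confidence type needs its own instantiation line, which is why keeping the templated kernel in the header is also a defensible choice.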

@arijit-dasgupta
Contributor Author

Implemented all the changes that fix MET-44 @horizon-blue @mugamma. They are the following:

  • Single bindings file
  • Single utils file
  • Implement GPU tests for sigmoid (vectorized)
  • Implement GPU tests for confidence
  • Use the CUDA_CALLABLE macro
  • Chill down on the sigmoid tests
  • Rename confidence classes (to reflect the actual number of params) + remove the alias as a result
  • Remove dispatch for C++ confidence python
  • remove binding for get_gpu_confidence
  • remove binding for sigmoid vector

Contributor

@mugamma mugamma left a comment

I don't want to keep you from merging your PR, so I've approved, but I think there is still too much in here.

I actually don't think the kernel launch tests for the confidence functions should test anything other than the fact that we can call these functions inside a kernel. Anything else those tests are measuring can be removed.

The correctness tests in C++ and python are also redundant. We can just keep the python ones against a ground-truth that we didn't implement, preferably the methods in the original FMB code.

Overall, a clean set of tests would be:

  1. C++ smoke tests for calling the functions inside kernels.
  2. Python tests for correctness of utilities (e.g. sigmoid) against reference scipy implementations.
  3. Python tests for correctness of confidence methods against reference implementations in the FMB library.
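Category 2 above might look like the following sketch. The stand-in implementations are hypothetical so it runs standalone; the real test would import sigmoid from the genmetaballs bindings and compare against scipy.special.expit.

```python
import math

# Stand-in for the C++/CUDA sigmoid exposed through the bindings; in the
# real test this would come from the genmetaballs package.
def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

# Reference equivalent of scipy.special.expit, written out with the tanh
# identity so this sketch has no scipy dependency.
def reference_expit(x: float) -> float:
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def test_sigmoid_matches_reference():
    for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
        assert abs(sigmoid(x) - reference_expit(x)) < 1e-12
```

The same structure applies to category 3, with the FMB reference implementation playing the role of reference_expit.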


#include <cmath>
#include <cuda_runtime.h>
#include <vector>
Contributor

Why do we need vector?

void cuda_check(cudaError_t code, const char* file, int line);

CUDA_CALLABLE __forceinline__ float sigmoid(float x) {
if (isnan(x)) {
Contributor

Is this check necessary? wouldn't the other branch already be a nan if the input is nan?

Contributor

All the "python" references in the comments here just mean "ground truth" right? No python.

Contributor

I don't know if this is necessary. This is essentially a smoke test?

Contributor

I also don't know how I feel about the testing pattern in this file. Essentially we are replicating the implementation and testing the implementation in the codebase against the re-implementation here. In a sense, the re-implementation here is the gold standard?

Contributor Author

I was actually gonna ask you about this too lol, but had to go for my meeting earlier

@arijit-dasgupta
Contributor Author

arijit-dasgupta commented Nov 19, 2025

@mugamma agreed with everything. It felt weird to do a correctness test in C++ by literally reimplementing the same thing for CPU. The smoke test thing is a good idea.

But for FMB, the codebase is a mess: none of these pieces are broken up into functions. What do we do then? Should I write a GT function (like I already did in Python) and add a permalink to where I think the FMB author had this?

I'll make all these changes before merging, don't worry. Good catches.

@arijit-dasgupta
Contributor Author

We should totally only merge if we all like the structure. If not, it'll set a bad precedent for later. Doing all the corrections in this branch so that it is easier for later.

Contributor

@horizon-blue horizon-blue left a comment

Looks great to me overall, minus the stuff that Matin has pointed out. Thanks for the update!

I personally don't mind if we are having more tests than necessary as long as they won't slow down the overall test suite by a lot, and it seems like they aren't, so it's fine to me if you want to keep them there.

Feel free to merge the PR when you feel ready to keep us moving forward :). We can always address the minor concerns in follow-up PRs.

CUDA_CALLABLE __forceinline__ float get_confidence(float sumexpd) const {
return 1.0f - expf(-sumexpd);
}
};
Contributor

It's generally a good practice to end text files with a newline. Otherwise, future diffs will show not just the new code but also a "change" to the previous last line (which is why GitHub shows a warning sign here). How do you feel about adding an InsertNewlineAtEOF: true to our current .clang-format to get this covered?
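For reference, this would be a one-line addition to the config (the option exists in clang-format 16 and later):

```yaml
# .clang-format (excerpt) — make clang-format ensure a trailing newline
InsertNewlineAtEOF: true
```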

Contributor Author

I like this EOF idea for clang-format. Should we make a PR for that?

Contributor Author

Oh wait, you already added it in 1424f94

nice!

#include <cstdint>
#include <nanobind/nanobind.h>
#include <nanobind/stl/vector.h>
#include <stdexcept>
Contributor

Is this being used by anything? 👀

@@ -1,12 +1,13 @@
#include <cstdint>
#include <cuda_runtime.h>
#include <vector>
Contributor

Nice catch

@mugamma mugamma merged commit de06966 into master Nov 20, 2025
1 check passed
@mugamma mugamma deleted the arijit/implement-confidence branch November 20, 2025 16:42