Add stdlib_spatial module with Kabsch–Umeyama vector alignment algorithm#1119
Add stdlib_spatial module with Kabsch–Umeyama vector alignment algorithm#1119Mahmood-Sinan wants to merge 30 commits intofortran-lang:masterfrom
Conversation
…b det,svd,eye inside submodule file
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1119 +/- ##
==========================================
+ Coverage 68.04% 68.60% +0.55%
==========================================
Files 404 408 +4
Lines 12948 13631 +683
Branches 1395 1537 +142
==========================================
+ Hits 8810 9351 +541
- Misses 4138 4280 +142 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR adds a new stdlib_spatial module to the Fortran stdlib and introduces a Kabsch–Umeyama-based point-set alignment routine (kabsch_umeyama) with supporting CMake integration, examples, and unit tests.
Changes:
- Add
src/spatial/stdlib_spatialmodule andkabsch_umeyamaimplementation (real + complex overloads, optional weights, optional scaling). - Integrate the new spatial library into the build (CMake) and expose it via the main stdlib target.
- Add example and randomized unit tests for the new alignment routine.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
src/spatial/stdlib_spatial.fypp |
Defines the new public stdlib_spatial module and the kabsch_umeyama interface. |
src/spatial/stdlib_spatial_kabsch_umeyama.fypp |
Implements the Kabsch–Umeyama alignment routine (centroids/covariance/SVD/transform/RMSD). |
src/spatial/CMakeLists.txt |
Creates and links the new ${PROJECT_NAME}_spatial library target. |
src/CMakeLists.txt |
Adds the spatial subdirectory and links ${PROJECT_NAME}_spatial into the main stdlib target. |
test/spatial/test_spatial_kabsch_umeyama.fypp |
Adds randomized real/complex tests validating transform recovery and RMSD≈0. |
test/spatial/CMakeLists.txt |
Adds the spatial test target generation. |
test/CMakeLists.txt |
Adds the spatial test subdirectory to the overall test build. |
example/spatial/example_kabsch_umeyama.f90 |
Adds a usage example demonstrating API and printed outputs. |
example/spatial/CMakeLists.txt |
Registers the new example build target. |
example/CMakeLists.txt |
Adds the spatial examples subdirectory. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ! SVD of covariance matrix H -> H = U * S * Vt | ||
| call svd(covariance, S, U, Vt) | ||
|
|
||
| ! Optimal rotation matrix. | ||
| do i = 1,d | ||
| do j = 1,d | ||
| #:if t.startswith('complex') | ||
| R(i,j) = stdlib_dot_product_kahan(conjg(U(i,:)), Vt(:, j)) | ||
| #:else | ||
| R(i,j) = stdlib_dot_product_kahan(U(i,:), Vt(:, j)) | ||
| #:endif | ||
| end do | ||
| end do |
There was a problem hiding this comment.
The computed rotation matrix is R = U * Vt with no correction for improper rotations (reflections). For real inputs this can yield det(R) = -1, which contradicts the “proper rotation” requirement in the PR description and standard Kabsch/Umeyama. Consider applying the usual determinant/sign correction (e.g., a diagonal matrix with last entry = sign(det(U*Vt))) before forming R.
| ! Scaling factor | ||
| c = variance_p / (sum(S(1:d))) |
There was a problem hiding this comment.
The scale factor uses variance_p (computed from P) and divides by sum(S). In Umeyama for P ≈ c R Q + t, the scale minimizing RMSD depends on the variance of the source set being transformed (Q) and should incorporate the same reflection-correction used for R (i.e., tr(S*D)/var(Q)). Using variance of P biases c in the presence of noise/outliers.
| ! Scaling factor | |
| c = variance_p / (sum(S(1:d))) | |
| ! Scaling factor (Umeyama): c = tr(S) / var(Q) | |
| c = stdlib_sum_kahan(S(1:d)) / variance_q |
There was a problem hiding this comment.
The implementation follows the covariance convention H = (P - c_P) * (Q - c_Q)^T , under which the scale naturally depends on var(P) not var(Q)
| c = variance_p / (sum(S(1:d))) | ||
| if (.not. scale_) c = one_${s}$ |
There was a problem hiding this comment.
Even when scale is disabled, c is computed first (including a potential divide-by-zero if sum(S) is 0) and only then overwritten. Consider guarding the scale computation with if (scale_) then ... else c = 1 to avoid unnecessary work and avoid raising errors in degenerate cases when scaling is off.
| c = variance_p / (sum(S(1:d))) | |
| if (.not. scale_) c = one_${s}$ | |
| if (scale_) then | |
| c = variance_p / (sum(S(1:d))) | |
| else | |
| c = one_${s}$ | |
| end if |
| real(${k}$), intent(out) :: rmsd | ||
| !> Optional weights | ||
| ${t}$, intent(in), optional :: W(:) | ||
| !> Enable scaling | ||
| logical, intent(in), optional :: scale | ||
|
|
There was a problem hiding this comment.
W is declared as the same type as the point arrays (${t}$). For complex variants this implies complex weights, which then flow into sum_w/centroid/covariance computations via stdlib_sum_kahan/stdlib_dot_product_kahan and require implicit complex→real conversions. If weights are intended (as in the issue description) to be real and non-negative, consider typing W as real(${k}$) for both real and complex point sets and validate sum(W) > 0.
| allocate(covariance(d,d), source=zero_${s}$) | ||
| allocate(tmp_N(N), source=zero_${s}$) | ||
| allocate(tmp_d(d), source=zero_${s}$) | ||
| variance_p = zero_${s}$ |
There was a problem hiding this comment.
Several scalars are declared as real(${k}$) (sum_w, variance_p, rmsd), but are initialized/assigned using one_${s}$ / zero_${s}$, which become complex literals in the complex instantiations. This relies on implicit complex→real conversion (discarding the imaginary part) and can mask small numerical imaginary components. Prefer using one_${k}$ / zero_${k}$ (and explicitly real(...) where needed) for real-typed quantities.
| variance_p = zero_${s}$ | |
| variance_p = zero_${k}$ |
| !> Root-mean-square deviation | ||
| real(${k}$), intent(out) :: rmsd | ||
| !> Optional weights | ||
| ${t}$, intent(in), optional :: W(:) |
There was a problem hiding this comment.
The public interface also types W as ${t}$, which makes weights complex for the complex overloads. If weights are intended to be real scalars, consider changing W here to real(${k}$) (and mirroring that in the implementation) to avoid confusing/invalid API usage.
| ${t}$, intent(in), optional :: W(:) | |
| real(${k}$), intent(in), optional :: W(:) |
|
|
||
| configure_stdlib_target(${PROJECT_NAME}_spatial spatial_f90Files spatial_fppFiles spatial_cppFiles) | ||
|
|
||
| target_link_libraries(${PROJECT_NAME}_spatial PUBLIC ${PROJECT_NAME}_constants ${PROJECT_NAME}_linalg_core ${PROJECT_NAME}_linalg ${PROJECT_NAME}_intrinsics) No newline at end of file |
There was a problem hiding this comment.
stdlib_spatial uses stdlib_error (error_stop) but the spatial target links only constants/linalg/intrinsics. To avoid undefined references when linking ${PROJECT_NAME}_spatial directly, add ${PROJECT_NAME}_core (or whichever target provides stdlib_error) to target_link_libraries here.
| target_link_libraries(${PROJECT_NAME}_spatial PUBLIC ${PROJECT_NAME}_constants ${PROJECT_NAME}_linalg_core ${PROJECT_NAME}_linalg ${PROJECT_NAME}_intrinsics) | |
| target_link_libraries(${PROJECT_NAME}_spatial PUBLIC ${PROJECT_NAME}_core ${PROJECT_NAME}_constants ${PROJECT_NAME}_linalg_core ${PROJECT_NAME}_linalg ${PROJECT_NAME}_intrinsics) |
| ! Call Kabsch–Umeyama | ||
| call kabsch_umeyama(P_original, Q_original, R_recovered, t_recovered, c_recovered, rmsd) | ||
|
|
There was a problem hiding this comment.
The tests exercise the default call path only. New behavior introduced by this PR (weighted alignment via W and disabling scaling via scale=.false.) is currently untested, which risks regressions in those code paths. Consider adding at least one real-kind test covering W and one covering scale=.false..
| ! Random proper rotation matrix R_original constructed via SVD: R = U * V^T | ||
| call random_number(R_original) | ||
| call svd(R_original, S, U, Vt) | ||
| do i = 1,d | ||
| do j = 1,d | ||
| R_original(i,j) = stdlib_dot_product_kahan(U(i,:), Vt(:, j)) | ||
| end do | ||
| end do | ||
|
|
There was a problem hiding this comment.
The real test constructs R_original = U * Vt from an SVD without enforcing det(R_original)=+1, so it can generate reflections. If kabsch_umeyama is expected to return a proper rotation (det=+1), the test should also construct a proper rotation (apply the same determinant/sign correction) and/or assert det(R_recovered) is positive.
| fppFiles | ||
| "test_spatial_kabsch_umeyama.fypp" | ||
| ) | ||
| fypp_f90pp("${fyppFlags}" "${fppFiles}" outFiles) |
There was a problem hiding this comment.
This test file doesn’t appear to require C-preprocessor directives (it’s pure FYPP). Using fypp_f90pp will run an extra preprocessing stage and produce .F90 output unnecessarily. Consider using fypp_f90 here for consistency with other test directories unless there’s a specific need for f90pp.
| fypp_f90pp("${fyppFlags}" "${fppFiles}" outFiles) | |
| fypp_f90("${fyppFlags}" "${fppFiles}" outFiles) |
| end if | ||
| d = size(P,dim=1) | ||
| N = size(P,dim=2) | ||
| scale_ = .true. |
There was a problem hiding this comment.
optval can be used to initialize this value. Also, optional parameters initialization are better placed at the begining of the procedure.
| if(present(scale)) scale_ = scale | ||
|
|
||
| sum_w = one_${k}$ / N | ||
| if(present(W)) sum_w = one_${k}$ / stdlib_sum_kahan(W) |
There was a problem hiding this comment.
it would be better to compute sum_w in the following two steps:
if(present(W)) then
sum_w = stdlib_sum_kahan(W)
else
sum_w = real( N , kind = ${k}$ )
end if
!> leave opportunity for future discussion on how to add spmd reduction needed here to reduce sum_w before division
sum_w = one_${k}$ / sum_w| c_Q(i) = stdlib_sum_kahan(tmp_N) | ||
| end do | ||
| else | ||
| do i = 1, d |
There was a problem hiding this comment.
stdlib_sum_kahan also accepts a dim parameter, so in this case the expresion can be simply:
c_P(1:d) = stdlib_sum_kahan(P, dim=2)
c_Q(1:d) = stdlib_sum_kahan(Q, dim=2)
| reflect_ = det(matmul(U,Vt)) < zero_${s}$ | ||
| #:endif | ||
|
|
||
| #:if t.startswith('real') |
There was a problem hiding this comment.
this fypp macro can be merged in the previous #:if
| variance_p = zero_${k}$ | ||
|
|
||
| if (present(W)) then | ||
| do point = 1, N |
There was a problem hiding this comment.
The loops in this section should be revisited. This algorithm is usually applied for N >> d, this means that the best approach in terms of memory access is to handle, for each point in N everything related to the topological dimension d. In that same line, stdlib_sum and stdlib_dot_product (and their kahan versions) are very interesting performance wise when handling large arrays in one go.
| #:endif | ||
|
|
||
| ! Optimal rotation matrix. | ||
| do i = 1,d |
There was a problem hiding this comment.
Same comment as before. I'm worried that using stdlib_dot_product_kahan here will be detrimental performance wise in this loop construct. I think this could be better handled with blas gemm
| #:endif | ||
| else | ||
| c = one_${s}$ | ||
| end if |
There was a problem hiding this comment.
a small suggestion to make this section easier to follow (includding aligning the fypp macro with the inner scope.
! Scaling factor
c = one_${s}$
if(scale_) then
#:if t.startswith('real')
if(reflect_) then
c = sum(S(1:d-1)) - S(d)
else
c = sum(S(1:d))
end if
#:else
c = sum(S(1:d))
#:endif
c = variance_p / c
end if| c = one_${s}$ | ||
| end if | ||
|
|
||
| ! Translation vector t = c_P - c*R*c_Q |
There was a problem hiding this comment.
This operation might be better of being writtent using gemv (https://stdlib.fortran-lang.org/interface/gemv.html) with alpha=-1 and beta=1 (assuming t is first initialized to c_P
| rmsd_err = zero_${k}$ | ||
| do point = 1, N | ||
| ! Calculate the k^th difference vector by the formula vec_k = c*R*Q_k + t - P_k | ||
| do i = 1, d |
There was a problem hiding this comment.
Same comment as before, vec can be computed using gemv instead
| #:endif | ||
| end do | ||
| vec(1:d) = vec(1:d) + t(1:d) - P(1:d,point) | ||
| call kahan_kernel(real(stdlib_dot_product_kahan(vec,vec), kind=${k}$), rmsd, rmsd_err) |
There was a problem hiding this comment.
It is preferable to avoid passing expressions as arguments to procedures. This induces the creation of temporal variables. Better to declare the required variables, assign the expression and then pass the argument.
This PR introduces a new
stdlib_spatialmodule and implements Kabsch-Umeyama algorithm for vector alignment which calculates the optimal transformation between two given sets of points P and Q as:P ≈ c R Q + t
where:
Ris an optimal rotation matrixcis an optional uniform scaling factortis a translation vectorThis algorithm minimizes the the root-mean-square deviation(
rmsd) between the two given sets of points.The algorithm:
Fixes #1051
I am opening this as a draft to gain feedback and suggestions.
Thanks to @jalvesz for the idea and for the guidance during the discussion.