Mysterious behaviors of code generated with symforce

Dear authors,

I am using symforce to generate some cost functions, which I use in an optimization problem solved with the Ceres solver and with our custom solver.
The development environment is Ubuntu 22 + ROS 2 Humble.

I have run into quite a few strange things.
If you have experience with any of them, I would appreciate hints on how to fix them.
I'd be happy to share my codebase if you are interested in taking a look.
Meanwhile, I will try to pinpoint the cause with unit tests as I find time.

The first warning sign is that neither the Ceres solver nor our solver converges on the problem built with the symforce Jacobians. With the sparse normal Cholesky solver in Ceres, the optimized parameters barely change from their initial values. Our solver moves the parameters towards the reference values to some extent, but does not converge either.
However, when the same problem is built with my analytic Jacobians and my custom parameter types, both the Ceres solver and our custom solver converge well.

So I checked the Jacobians of all cost functions against numeric Jacobians.
A. The numeric Jacobians sometimes disagree with the symbolic ones. For a given parameter, the numeric Jacobian in some cost functions differs greatly from the value produced by symforce.
The problem occurs mostly in the rotation part of a pose.
However, the two Jacobians agree well in other instances of the same type of cost function.
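
For reference, the check in A is conceptually along the following lines. This is only a minimal Python sketch, not the actual code in calib_cam_imu.cpp: the function names (`numeric_jacobian`, `find_mismatches`), the step size, the tolerances, and the toy residual are all assumptions made for illustration.

```python
import math


def numeric_jacobian(f, x, h=1e-6):
    """Central-difference Jacobian of f at x (rows: residuals, columns: parameters)."""
    n_res = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_res)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        r_plus, r_minus = f(x_plus), f(x_minus)
        for i in range(n_res):
            jac[i][j] = (r_plus[i] - r_minus[i]) / (2.0 * h)
    return jac


def find_mismatches(num, sym, rel_tol=1e-4, abs_tol=1e-8):
    """Return (row, col) indices where two Jacobians disagree or are not finite."""
    bad = []
    for i, (row_num, row_sym) in enumerate(zip(num, sym)):
        for j, (a, b) in enumerate(zip(row_num, row_sym)):
            if not (math.isfinite(a) and math.isfinite(b)):
                bad.append((i, j))  # inf/nan entries are always reported
            elif abs(a - b) > abs_tol + rel_tol * max(abs(a), abs(b)):
                bad.append((i, j))
    return bad


# Hypothetical toy residual and its hand-derived Jacobian, for illustration only.
def residual(x):
    return [x[0] * x[1], math.sin(x[0])]


def analytic_jacobian(x):
    return [[x[1], x[0]], [math.cos(x[0]), 0.0]]
```

Calling `find_mismatches(numeric_jacobian(residual, x), analytic_jacobian(x))` returns an empty list when the two agree within tolerance. Note that such a comparison is only meaningful when both Jacobians are taken with respect to the same parameterization: if a rotation is stored as a 4-component quaternion, perturbing each storage component independently leaves the unit-norm constraint.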

B. In the same terminal, I got very different results when I ran the same program via an absolute path and via a relative path.
The run with the relative path produces many wrong residuals containing inf values.
The run with the absolute path looks fine in terms of residuals.
But neither run converges.
I have not observed this erratic behavior with my other program that solves the same optimization problem.

This leaves me with doubts about several points.
C. Do we need EIGEN_MAKE_ALIGNED_OPERATOR_NEW in C++20 programs?
D. Since symforce may not generate code properly for if/else branching, does it work with for loops that iterate over an integer range?

E. Here is an example of a bad residual and its bad Jacobians.
All input values look normal; only the residual and the Jacobians are off.
```
I20250310 11:23:52.678215 3228500 calib_cam_imu.cpp:294] Bad IMU res     -364.623     -32.5488     -160.616      -3782.2 3.26517e+132           -0 with interval 0.02
bgba 0 0 0 0 0 0 g_dir 0.0515762  0.844481 -0.533096 gMag 9.8387
I20250310 11:23:52.678242 3228500 calib_cam_imu.cpp:298] 0 pt 0.0211226 -0.681036 -0.731859  0.011227  0.279714  0.354241  0.485652
I20250310 11:23:52.678256 3228500 calib_cam_imu.cpp:298] 1 pt 0.0110132 -0.659783 -0.751188  0.016748  0.312391  0.461406   0.49024
I20250310 11:23:52.678267 3228500 calib_cam_imu.cpp:298] 2 pt 0.00372726  -0.649199   -0.76046 -0.0150726   0.283718   0.443901   0.397507
I20250310 11:23:52.678279 3228500 calib_cam_imu.cpp:298] 3 pt -0.00157705   -0.664078   -0.747651  -0.0041484    0.288184    0.446251    0.491123
I20250310 11:23:52.678293 3228500 calib_cam_imu.cpp:298] 4 pt 0.0144139 -0.666337 -0.745112 0.0244002  0.298574  0.468022  0.647042
I20250310 11:23:52.678305 3228500 calib_cam_imu.cpp:300] lambdas    0.994833   0.0210913  -0.0645651
   0.760286    0.314348   -0.158968
   0.153628    0.243853    0.290085
3.65713e-09 7.31427e-07 0.000109714
sqrtinfo 441.942 441.942 441.942 25.2538 25.2538 25.2538
meas -0.0464044   -0.16487 -0.0653767     0.6894    6.34896   -7.71682
I20250310 11:23:52.678337 3228500 calib_cam_imu.cpp:176] Checking ImuFactorType<5> Jacs
I20250310 11:23:52.678377 3228500 calib_cam_imu.cpp:194] Mismatch at x0
Numeric:      -377.282       8022.24       -7186.1       227.241             0             0             0
     -461.355      -14.1113      -4.43716      -407.789             0             0             0
      407.396       17.1896      -16.7583      -461.319             0             0             0
     -76313.8       6272.25       12840.7         62633         18276       1178.38       123.189
-1.31736e+131 -5.86641e+132 -4.93862e+132   4.0429e+131  2.20721e+129   9.9442e+130 -1.27868e+132
            0             0             0             0             0             0             0
Symbolic:      -372.854       7879.45       -7339.5       229.594             0             0             0
     -461.324      -15.1094      -5.50979      -407.773             0             0             0
      407.312       19.9029      -13.8425      -461.364             0             0             0
     -76005.9      -3655.86       2171.65       62796.7         18276       1178.38       123.189
-2.92515e+131 -6.82801e+131  6.31833e+131  3.18826e+131  2.20721e+129   9.9442e+130 -1.27868e+132
           -0             0             0             0             0            -0            -0
I20250310 11:23:52.678509 3228500 calib_cam_imu.cpp:194] Mismatch at x1
Numeric:       324.125      -7952.77       7261.59       -327.51             0             0             0
      449.458       9.62375       9.47219       420.907             0             0             0
     -420.578      -20.5008       16.5516         449.2             0             0             0
       -13799      -622.437       438.142       11399.1      -18269.1      -1177.94      -123.142
-5.30722e+130 -1.24864e+131  1.13727e+131  5.78905e+130 -2.20637e+129 -9.94044e+130   1.2782e+132
            0             0             0             0             0             0             0
Symbolic:       326.433      -8091.06        7104.1      -323.999             0             0             0
      449.474       8.65671       8.37117       420.932             0             0             0
     -420.622      -17.8722       19.5443       449.133             0             0             0
     -13800.3       -542.94       528.649       11397.1      -18269.1      -1177.94      -123.142
-5.30418e+130  -1.2662e+131  1.11727e+131   5.7936e+130 -2.20637e+129 -9.94044e+130   1.2782e+132
           -0             0             0             0            -0             0             0
I20250310 11:23:52.678639 3228500 calib_cam_imu.cpp:194] Mismatch at x2
Numeric:        41.587       -2145.6       1881.01      -20.0914             0             0             0
      122.185      0.227742      -1.28729       107.986             0             0             0
     -107.968      -2.47536      0.876618       122.181             0             0             0
      1865.15       38.2883      -16.4881      -1602.36        619854       39966.2       4178.09
  7.5049e+129  1.72582e+130 -1.50375e+130 -8.57357e+129    7.486e+130  3.37269e+132 -4.33681e+133
            0             0             0             0             0             0             0
Symbolic:       41.7251      -2169.65       1852.83      -20.6499             0             0             0
      122.186    0.00627178      -1.54671       107.981             0             0             0
     -107.963      -3.32168     -0.114752       122.161             0             0             0
      1865.08       50.4837       -2.2024      -1602.08        619854       39966.2       4178.09
 7.50347e+129  1.75105e+130  -1.4742e+130 -8.56769e+129    7.486e+130  3.37269e+132 -4.33681e+133
            0             0             0            -0             0            -0            -0
I20250310 11:23:52.678768 3228500 calib_cam_imu.cpp:194] Mismatch at x3
Numeric:       174.637         -3083       2810.17      -13.3411             0             0             0
       183.74        10.554      0.786954       152.243             0             0             0
     -152.122      -2.67192       7.61443       183.864             0             0             0
      25993.1       935.219      -1001.37      -22181.7       -619861      -39966.7      -4178.14
 9.78762e+130  2.39228e+131 -2.15266e+131 -1.15199e+131 -7.48608e+130 -3.37273e+132  4.33686e+133
            0             0             0             0             0             0             0
Symbolic:       174.552      -3118.78       2769.87      -13.5649             0             0             0
      183.727       4.89713       -5.5818       152.208             0             0             0
     -152.129      -5.62133       4.29378       183.846             0             0             0
      25993.3       1053.85      -867.806      -22180.9       -619861      -39966.7      -4178.14
 9.78836e+130  2.40823e+131 -2.13471e+131 -1.15186e+131 -7.48608e+130 -3.37273e+132  4.33686e+133
            0            -0             0             0            -0             0             0
I20250310 11:23:52.678897 3228500 calib_cam_imu.cpp:194] Mismatch at x4
Numeric:       16.0175       5248.18      -4672.49      -62.9201             0             0             0
     -309.033      -3.01554     -0.714374      -256.741             0             0             0
      256.721      -2.93402       1.68065      -309.052             0             0             0
     -27768.6      -145.714       162.152       23927.6             0             0             0
-1.14804e+131 -2.53403e+131  2.25935e+131  1.29644e+131             0             0             0
            0             0             0             0             0             0             0
Symbolic:       16.2595       5236.95      -4685.02      -62.5099             0             0             0
     -308.915      -8.46436      -6.80736      -256.542             0             0             0
      256.766      -5.02484     -0.657359      -308.975             0             0             0
     -27770.9      -39.1951       281.262       23923.7            -0            -0            -0
-1.14828e+131  -2.5206e+131  2.27434e+131  1.29599e+131            -0            -0             0
            0             0            -0             0            -0             0             0
I20250310 11:23:52.679016 3228500 calib_cam_imu.cpp:200] Mismatch at bgba
Numeric:      441.942            0            0            0            0            0
           0      25.2538            0            0            0            0
           0            0      25.2538            0            0            0
           0            0            0      25.2538            0            0
           0            0            0            0 1.76851e+129            0
           0            0            0            0            0            0
Symbolic:      441.942            0            0            0            0            0
           0      25.2538            0            0            0            0
           0            0      25.2538            0            0            0
           0            0            0      25.2538            0            0
           0            0            0            0 1.76851e+129            0
           0            0            0            0            0 2.42092e-322
I20250310 11:23:52.679114 3228500 calib_cam_imu.cpp:204] Mismatch at g_dir
Numeric:             0             0             0             0
            0             0             0             0
            0             0             0             0
    0.0453093      -1.12838       2.03588     -0.256618
 2.92155e+128 -1.76059e+127 -4.32413e+125 -1.11364e+127
            0             0             0             0
Symbolic:             0             0             0             0
            0             0             0             0
            0             0             0             0
    -0.150279      -1.11644       2.03588      -0.14848
 6.35134e+127 -3.63999e+126 -4.32377e+125  1.15278e+128
            0             0             0             0
```
Update:
I think B and E are caused by one dependency being compiled with -march=native while the target program is compiled without this flag. After removing the flag from the dependency, E disappears.

