Skip to content

Conversation

@Ishan1923
Copy link

Description

This PR addresses a safety issue where invalid sensor data (NaN or Infinity) could corrupt the odometry state in steering controllers.

Problem:
Previously, if a sensor returned NaN or Inf (e.g., due to a hardware fault or connection glitch), the SteeringOdometry and SteeringKinematics update loops would perform calculations with these invalid values. This caused the robot's odometry to become invalid (NaN), potentially destabilizing the control loop or causing unpredictable behavior.

Solution:

  1. Library Update: Added try_update_from_position and try_update_from_velocity methods to SteeringKinematics (and the legacy SteeringOdometry wrapper). These methods perform a validity check on the inputs before updating the internal state.
  2. Controller Update: Updated AckermannSteeringController to use these new safe update methods. If the update fails (returns false), the controller logs a warning and skips the odometry update for that cycle, preserving the last known valid state.

Testing

  • Compiled successfully with colcon build.
  • Verified unit tests pass with colcon test.
  • Ran static analysis and formatting checks via pre-commit run.

Checklist

  • Fork the repository.
  • Modify the source; focus on the specific change.
  • Ensure local tests pass (colcon test and pre-commit run).
  • Commit to your fork using clear commit messages.
  • Send a pull request, answering any default questions.
  • Pay attention to automated CI failures.

Copy link
Member

@saikishor saikishor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not implement the protection logic in the original methods instead of new ones?

@Ishan1923
Copy link
Author

Thank you for the review! I originally created new methods to be conservative and ensure I didn't break existing behavior/ABI. However, I agree that hardening the original methods is a much cleaner approach. I will update the PR with those changes shortly.

@Ishan1923
Copy link
Author

Ishan1923 commented Dec 27, 2025

I have refactored the code based on the review.

Hardened Existing Methods: I moved the NaN/Inf checks inside the original update_from_position and update_from_velocity methods as requested. They now return false if invalid data is detected, keeping the API clean.

Open Loop Handling: The original update_open_loop returns void, so changing it to bool to propagate errors would break ABI compatibility. To solve this:

  • I added a silent safety check inside the original update_open_loop (it just returns early on failure).

  • I added a helper try_update_open_loop that returns bool, which allows the controller to detect failures explicitly without modifying the existing function signature.

@Ishan1923 Ishan1923 requested a review from saikishor December 29, 2025 06:55
Comment on lines 62 to 64
if(!odometry_.try_update_open_loop(
last_linear_velocity_, last_angular_velocity_, period.seconds()
))return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apply pre-commit changes


void reset_odometry();

bool try_update_open_loop(double linear, double angular, double delTime);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the naming as we will always have a successful update call, but it may or maynot fail, so not sure about the naming here

@christophfroehlich what do you think about adding this new method? or we simply change the void return type to bool and have it all¿+?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saikishor Regarding the naming and the new method(which is redundant, yes):
I introduced try_update_open_loop (which returns bool) alongside the existing update_open_loop (returning void) specifically to preserve ABI compatibility. Changing the return type of the existing function from void to bool would break binary compatibility for any downstream users linking against steering_controllers_library.

However, if you and @christophfroehlich prefer a cleaner API and are okay with breaking ABI for this release, I am happy to remove the try_ wrapper and modify the original update_open_loop to return bool directly. Just let me know your preference.

Copy link
Member

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the problem as described in the initial post. All implementations have the pattern

if (isfinite(state))
  odometry.update()

Where exactly is the odometry updated with infinite values?

@Ishan1923
Copy link
Author

I don't see the problem as described in the initial post. All implementations have the pattern

if (isfinite(state))
  odometry.update()

Where exactly is the odometry updated with infinite values?

@christophfroehlich I can point to the exact location. You are correct that the sensor update methods guard against this, but the Open Loop command path does not.

Here is the execution chain in the current master branch that leads to the issue:

  1. In steering_kinematics.cpp, the integrate_fk method performs math directly on its inputs without checking them:
void SteeringKinematics::integrate_fk(const double v_bx, const double omega_bz, const double dt)
{
  const double delta_x_b = v_bx * dt; // <--- If v_bx is NaN, delta_x_b becomes NaN
  // ...
  // x_ += ... (Odometry state becomes permanently NaN here)
}
  1. The function that calls this math, update_open_loop, acts as a pass-through. It does not check isfinite() before calling the integration:
void SteeringKinematics::update_open_loop(const double v_bx, const double omega_bz, const double dt)
{
    / <--- MISSING: if (!isfinite(v_bx)) return;
  
     linear_ = v_bx;
     angular_ = omega_bz;
     integrate_fk(v_bx, omega_bz, dt); // <--- Passes NaN inputs directly to the math above
}
  1. In the controller's update loop, if open_loop is enabled, it passes the last command velocity directly to the library:
// (In AckermannSteeringController::update_odometry)
if (params_.open_loop) {
     // Uses command reference (which can be NaN from an upstream topic)
      odometry_.update_open_loop(last_linear_velocity_, last_angular_velocity_, period.seconds());
}

Because of this chain, a single NaN in the command topic (when in open-loop mode) bypasses all checks and permanently destroys the odometry state.
My PR guards this specific entry point.

@christophfroehlich
Copy link
Member

In open_loop mode, there is no sensor involved.

Even more, see #2087, open_loop was broken and never updated without my fix. But I see the issue now if the reference interfaces are NaN.

Please rebase your changes on top of mine.
This could be solved way simpler by just not updating the last*velocity variables if they are not finite, what do you think?

Please also add appropriate tests to steering_controllers_library, where it should be checked if the odometry is updated in open_loop mode, including the case of a timeout.

@Ishan1923
Copy link
Author

Ishan1923 commented Jan 1, 2026

In open_loop mode, there is no sensor involved.

Even more, see #2087, open_loop was broken and never updated without my fix. But I see the issue now if the reference interfaces are NaN.

Please rebase your changes on top of mine. This could be solved way simpler by just not updating the last*velocity variables if they are not finite, what do you think?

Please also add appropriate tests to steering_controllers_library, where it should be checked if the odometry is updated in open_loop mode, including the case of a timeout.

@christophfroehlich thanks for the review and for the fix #2087 ; yes, I think this way, it could be way simpler.

Working on this now:

  1. Rebase on top of your latest changes.
  2. Revert the library changes and implement the isfinite check directly in the controller when updating last_*velocity.
  3. Add a test case to verify the behavior with NaN inputs as requested.

This adds 'try_update' methods to SteeringKinematics and SteeringOdometry to safely handle cases where sensors return NaN or Infinite values.

Previously, invalid sensor data caused the update loop to compute invalid odometry, potentially destabilizing the controller.

Updates:
- Added try_update_from_position/velocity to library
- Updated Ackermann controller to use safe update methods

Signed-off-by: Ishan1923 <[email protected]>
- Refactored update_from_position/velocity to handle NaN checks internally per review.
- Added try_update_open_loop to allow error prpagation without breaking ABI.
- Updated Ackermann controller to use safe library methods.
- Added isfinite check in update_and_write_commands to protect last_linear_velocity_
- Added regression test confirming robot stops safely on NaN input

Signed-off-by: Ishan1923 <[email protected]>
@Ishan1923 Ishan1923 force-pushed the fix/ackermann-odometry-safety branch from 8382dc0 to c1b2281 Compare January 1, 2026 13:39
@Ishan1923
Copy link
Author

@christophfroehlich I have updated the PR based on your feedback.

Changes in this revision:

  1. Reverted all modifications to AckermannSteeringController, SteeringKinematics, and SteeringOdometry to rely on the upstream fixes in Fix open_loop odometry of steering controllers #2087.
  2. Implemented the isfinite check directly in steering_controllers_library.cpp (in update_and_write_commands) to prevent invalid values from overwriting the last known velocity.
  3. Updated the regression test to verify that the controller safely stops (issuing 0.0 commands) when NaN input is received, rather than persisting the last value.

Ready for re-review!

- Reset last_linear_velocity_ and last_angular_velocity_ to 0.0 when reference interfaces are NaN (indicating a timeout) in update_and_write_commands.
- Previously, the controller persisted the last valid velocity during a timeout in open-loop mode, causing unsafe behavior.
- Added test_open_loop_update_timeout to verify that velocities reset to zero upon timeout.
- Added necessary FRIEND_TEST macro to test_steering_controllers_library.hpp to enable the test case.
@Ishan1923
Copy link
Author

Hi @christophfroehlich ,

I have updated the PR with the requested changes.

  • Test Coverage: I added a new test case, test_open_loop_update_timeout, to steering_controllers_library.
  • This test successfully reproduced the issue: before the fix, the robot continued moving at the last valid velocity even after the timeout occurred.
  • The Fix: I updated update_and_write_commands to explicitly check for NaN values in the reference interfaces (which occur during a timeout). Now, if NaN is detected while in open-loop mode, the internal velocity states (last_linear_velocity_, etc.) are safely reset to 0.0.
  if (std::isfinite(reference_interfaces_[0]))
  {
    last_linear_velocity_ = reference_interfaces_[0];
  }
  else{
    last_linear_velocity_ = 0.0;
  }
  if (std::isfinite(reference_interfaces_[1]))
  {
    last_angular_velocity_ = reference_interfaces_[1];
  }
  else{
    last_angular_velocity_ = 0.0;
  }
  update_odometry(period);
  

Copy link
Member

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please install pre-commit in your workspace, and activate it for this repository (pre-commit install). To fix the failing tests now, simply run pre-commit run --all

@codecov
Copy link

codecov bot commented Jan 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.84%. Comparing base (b6c1137) to head (aee344f).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2083      +/-   ##
==========================================
+ Coverage   84.79%   84.84%   +0.04%     
==========================================
  Files         151      151              
  Lines       14607    14654      +47     
  Branches     1266     1268       +2     
==========================================
+ Hits        12386    12433      +47     
  Misses       1763     1763              
  Partials      458      458              
Flag Coverage Δ
unittests 84.84% <100.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...llers_library/src/steering_controllers_library.cpp 69.42% <100.00%> (+0.44%) ⬆️
...library/test/test_steering_controllers_library.cpp 100.00% <100.00%> (ø)
...library/test/test_steering_controllers_library.hpp 97.43% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Ishan1923 Ishan1923 force-pushed the fix/ackermann-odometry-safety branch from 0336579 to 2804dd6 Compare January 7, 2026 12:20
@Ishan1923
Copy link
Author

Pushed the style fixes via pre-commit. The linting checks should pass now. Ready for review!

@christophfroehlich
Copy link
Member

Please don't do force pushes if a PR is already under review, as this just complicates new reviews.

@Ishan1923
Copy link
Author

Apologies for the force push. I wasn't aware that it disrupts the review history. I wanted to keep the commit log clean after the pre-commit fixes, but I will stick to regular commits/pushes for future updates to preserve the diff context. I again genuinely apologize.

@Ishan1923
Copy link
Author

Ishan1923 commented Jan 7, 2026

I have added the missing header to fix the strict build compilation error on Linux. I verified locally that the strict build passes now.

Regarding the other failure: I noticed it failed with an Authentication required error (Invoke-WebRequest : Authentication required) during a download, which I don't see is related to my changes.

Could you please re-trigger the workflows?

@christophfroehlich
Copy link
Member

Regarding the other failure: I noticed it failed with an Authentication required error (Invoke-WebRequest : Authentication required) during a download, which I don't see is related to my changes.

Failing windows build is known and not related.

@christophfroehlich
Copy link
Member

Apologies for the force push. I wasn't aware that it disrupts the review history. I wanted to keep the commit log clean after the pre-commit fixes, but I will stick to regular commits/pushes for future updates to preserve the diff context. I again genuinely apologize.

There is no need for a linear history, as we use the "squash" strategy when PRs get merged.

@Ishan1923
Copy link
Author

Thanks for the clarification on the force-push policy and the Windows build flake. I noticed the pre-commit check failed on the previous push (likely due to header sorting), so I've pushed a quick style fix. Everything should be green now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants