fix(ackermann): handle NaN/Inf values in odometry update #2083
base: master
saikishor
left a comment
Why not implement the protection logic in the original methods instead of new ones?
Thank you for the review! I originally created new methods to be conservative and ensure I didn't break existing behavior/ABI. However, I agree that hardening the original methods is a much cleaner approach. I will update the PR with those changes shortly.
I have refactored the code based on the review.
- Hardened existing methods: I moved the NaN/Inf checks inside the original update_from_position and update_from_velocity methods as requested. They now return false if invalid data is detected, keeping the API clean.
- Open-loop handling: The original update_open_loop returns void, so changing it to bool to propagate errors would break ABI compatibility. To solve this:
if (!odometry_.try_update_open_loop(
    last_linear_velocity_, last_angular_velocity_, period.seconds()))
{
  return false;
}
Apply pre-commit changes
void reset_odometry();

bool try_update_open_loop(double linear, double angular, double delTime);
I'm not sure about the naming: we will always have a successful update call, but it may or may not fail, so the name doesn't seem right to me.
@christophfroehlich what do you think about adding this new method? Or do we simply change the void return type to bool and have it all?
@saikishor Regarding the naming and the new method (which is redundant, yes):
I introduced try_update_open_loop (which returns bool) alongside the existing update_open_loop (returning void) specifically to preserve ABI compatibility. Changing the return type of the existing function from void to bool would break binary compatibility for any downstream users linking against steering_controllers_library.
However, if you and @christophfroehlich prefer a cleaner API and are okay with breaking ABI for this release, I am happy to remove the try_ wrapper and modify the original update_open_loop to return bool directly. Just let me know your preference.
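The trade-off described here can be illustrated with a minimal sketch. It is not the library's code: `OdometrySketch`, its members, and the midpoint integration step are assumptions made purely for illustration. It only shows the shape of the proposal, where the existing void method keeps its signature and a bool-returning variant reports rejected input.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical, simplified stand-in for an odometry class (not the real SteeringOdometry).
class OdometrySketch
{
public:
  // Existing signature is left untouched; it delegates and discards the result.
  void update_open_loop(double linear, double angular, double dt)
  {
    (void)try_update_open_loop(linear, angular, dt);
  }

  // New variant: returns false and leaves the pose unchanged on non-finite input.
  bool try_update_open_loop(double linear, double angular, double dt)
  {
    if (!std::isfinite(linear) || !std::isfinite(angular) || !std::isfinite(dt))
    {
      return false;
    }
    integrate(linear, angular, dt);
    return true;
  }

  double x() const { return x_; }
  double y() const { return y_; }
  double heading() const { return heading_; }

private:
  // Precondition: callers have already validated the inputs.
  void integrate(double linear, double angular, double dt)
  {
    assert(std::isfinite(linear) && std::isfinite(angular) && std::isfinite(dt));
    const double heading_mid = heading_ + 0.5 * angular * dt;  // midpoint heading (assumed scheme)
    x_ += linear * std::cos(heading_mid) * dt;
    y_ += linear * std::sin(heading_mid) * dt;
    heading_ += angular * dt;
  }

  double x_ = 0.0;
  double y_ = 0.0;
  double heading_ = 0.0;
};
```

If the maintainers prefer the single-method API, the same guard would simply move into `update_open_loop` with a `bool` return type.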
christophfroehlich
left a comment
I don't see the problem as described in the initial post. All implementations have the pattern
if (isfinite(state))
odometry.update()
Where exactly is the odometry updated with infinite values?
@christophfroehlich I can point to the exact location. You are correct that the sensor update methods guard against this, but the Open Loop command path does not. Here is the execution chain in the current master branch that leads to the issue:
Because of this chain, a single NaN in the command topic (when in open-loop mode) bypasses all checks and permanently destroys the odometry state.
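Why one NaN is permanent follows from IEEE 754 arithmetic: any sum involving NaN is NaN, so an accumulated pose never recovers once a NaN has been added. The sketch below shows this on a one-dimensional integrator. `Pose1D`, `integrate_unguarded`, and `integrate_guarded` are hypothetical names, not part of steering_controllers_library, and the positive-period precondition is an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 1-D pose used only to illustrate NaN propagation.
struct Pose1D
{
  double x = 0.0;
};

// Unguarded accumulation: a NaN velocity turns x into NaN, and it stays NaN afterwards.
inline void integrate_unguarded(Pose1D & pose, double velocity, double dt)
{
  assert(dt > 0.0);  // assumed: dt is a positive controller period
  pose.x += velocity * dt;
}

// Guarded accumulation: non-finite input is rejected, so the last valid state survives.
inline bool integrate_guarded(Pose1D & pose, double velocity, double dt)
{
  assert(dt > 0.0);  // assumed: dt is a positive controller period
  if (!std::isfinite(velocity))
  {
    return false;
  }
  pose.x += velocity * dt;
  return true;
}
```

With the unguarded version, feeding one NaN command followed by valid commands leaves `x` as NaN; with the guarded version, the bad cycle is skipped and later valid commands continue from the last good value.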
In open_loop mode, there is no sensor involved. Even more, see #2087, open_loop was broken and never updated without my fix. But I see the issue now if the reference interfaces are NaN. Please rebase your changes on top of mine. Please also add appropriate tests to steering_controllers_library, where it should be checked if the odometry is updated in open_loop mode, including the case of a timeout.
@christophfroehlich thanks for the review and for the fix #2087 ; yes, I think this way, it could be way simpler. Working on this now:
This adds 'try_update' methods to SteeringKinematics and SteeringOdometry to safely handle cases where sensors return NaN or Infinite values. Previously, invalid sensor data caused the update loop to compute invalid odometry, potentially destabilizing the controller.
Updates:
- Added try_update_from_position/velocity to library
- Updated Ackermann controller to use safe update methods
Signed-off-by: Ishan1923 <[email protected]>

- Refactored update_from_position/velocity to handle NaN checks internally per review.
- Added try_update_open_loop to allow error propagation without breaking ABI.
- Updated Ackermann controller to use safe library methods.

- Added isfinite check in update_and_write_commands to protect last_linear_velocity_
- Added regression test confirming robot stops safely on NaN input
Signed-off-by: Ishan1923 <[email protected]>
8382dc0 to c1b2281
@christophfroehlich I have updated the PR based on your feedback. Changes in this revision:
Ready for re-review!
- Reset last_linear_velocity_ and last_angular_velocity_ to 0.0 when reference interfaces are NaN (indicating a timeout) in update_and_write_commands.
- Previously, the controller persisted the last valid velocity during a timeout in open-loop mode, causing unsafe behavior.
- Added test_open_loop_update_timeout to verify that velocities reset to zero upon timeout.
- Added necessary FRIEND_TEST macro to test_steering_controllers_library.hpp to enable the test case.
Hi @christophfroehlich , I have updated the PR with the requested changes.
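// Latch each reference only when it is finite; a NaN reference (timeout) resets it to zero.
if (std::isfinite(reference_interfaces_[0]))
{
  last_linear_velocity_ = reference_interfaces_[0];
}
else
{
  last_linear_velocity_ = 0.0;
}
if (std::isfinite(reference_interfaces_[1]))
{
  last_angular_velocity_ = reference_interfaces_[1];
}
else
{
  last_angular_velocity_ = 0.0;
}
// Odometry is then integrated from the latched (always finite) velocities.
update_odometry(period);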
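The two per-interface branches follow the same rule, so they could be condensed into one helper. The following is only a sketch under assumptions: `finite_or_zero`, `OpenLoopState`, and `latch_references` are hypothetical names that do not exist in the controller, and the struct merely stands in for the two member variables.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: pass finite references through, map NaN/Inf (e.g. a timeout) to zero.
inline double finite_or_zero(double reference)
{
  return std::isfinite(reference) ? reference : 0.0;
}

// Hypothetical stand-in for the controller's last_linear_velocity_ / last_angular_velocity_.
struct OpenLoopState
{
  double last_linear_velocity = 0.0;
  double last_angular_velocity = 0.0;
};

// Latch both references; afterwards the stored velocities are guaranteed to be finite.
inline void latch_references(OpenLoopState & state, double linear_ref, double angular_ref)
{
  state.last_linear_velocity = finite_or_zero(linear_ref);
  state.last_angular_velocity = finite_or_zero(angular_ref);
  assert(std::isfinite(state.last_linear_velocity));
  assert(std::isfinite(state.last_angular_velocity));
}
```

Each interface is handled independently, so a NaN on the linear reference zeroes only the linear velocity, which matches the behavior of the branches in the comment.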
christophfroehlich
left a comment
Please install pre-commit in your workspace, and activate it for this repository (pre-commit install). To fix the failing tests now, simply run pre-commit run --all
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2083      +/-   ##
==========================================
+ Coverage   84.79%   84.84%   +0.04%
==========================================
  Files         151      151
  Lines       14607    14654      +47
  Branches     1266     1268       +2
==========================================
+ Hits        12386    12433      +47
  Misses       1763     1763
  Partials      458      458
Signed-off-by: Ishan1923 <[email protected]>
0336579 to 2804dd6
Pushed the style fixes via pre-commit. The linting checks should pass now. Ready for review!
Please don't do force pushes if a PR is already under review, as this just complicates new reviews.
Apologies for the force push. I wasn't aware that it disrupts the review history. I wanted to keep the commit log clean after the pre-commit fixes, but I will stick to regular commits/pushes for future updates to preserve the diff context. I again genuinely apologize.
Signed-off-by: Ishan1923 <[email protected]>
I have added the missing header to fix the strict build compilation error on Linux. I verified locally that the strict build passes now. Regarding the other failure: I noticed it failed with an Authentication required error (Invoke-WebRequest : Authentication required) during a download, which does not appear to be related to my changes. Could you please re-trigger the workflows?
The failing Windows build is a known issue and is not related.
There is no need for a linear history, as we use the "squash" strategy when PRs get merged. |
Signed-off-by: Ishan1923 <[email protected]>
Thanks for the clarification on the force-push policy and the Windows build flake. I noticed the pre-commit check failed on the previous push (likely due to header sorting), so I've pushed a quick style fix. Everything should be green now!
Description
This PR addresses a safety issue where invalid sensor data (NaN or Infinity) could corrupt the odometry state in steering controllers.
Problem:
Previously, if a sensor returned NaN or Inf (e.g., due to a hardware fault or connection glitch), the SteeringOdometry and SteeringKinematics update loops would perform calculations with these invalid values. This caused the robot's odometry to become invalid (NaN), potentially destabilizing the control loop or causing unpredictable behavior.
Solution:
- Added try_update_from_position and try_update_from_velocity methods to SteeringKinematics (and the legacy SteeringOdometry wrapper). These methods perform a validity check on the inputs before updating the internal state.
- Updated AckermannSteeringController to use these new safe update methods. If the update fails (returns false), the controller logs a warning and skips the odometry update for that cycle, preserving the last known valid state.
Testing
- colcon build.
- colcon test.
- pre-commit run.
Checklist
- colcon test and pre-commit run).