Fix #765: Add error handling for OMPL path hard crashes and stop sign… #780

Open
jb3982 wants to merge 5 commits into main from jb3982/765-Fixing_Pathfinding_error_handling

jb3982 commented Feb 3, 2026

Description

This PR addresses the issue where the pathfinding node would crash when OMPL fails to compute a valid state space solution. The changes introduce error handling and a new stop signaling mechanism.

Changes made:

  1. Added sail boolean field to DesiredHeading message to signal when the boat should stop
  2. Implemented try-except blocks in ompl_path.get_path() to catch OMPL exceptions
  3. Added error handling in local_path.update_if_needed() for both new and old path retrievals
  4. Updated node_navigate.desired_heading_callback() to detect pathfinding failures and set sail=False
  5. Documented the new sail field behavior and error handling in custom_interfaces README.md

Behavior:

  • When pathfinding fails, DesiredHeading is published with sail=False, heading=0.0, and steering=0
  • This prevents hard crashes and provides an explicit stop signal to downstream nodes
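
The stop-signal behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `DesiredHeading` dataclass is a simplified, hypothetical stand-in for the real custom_interfaces message (which nests the heading value), and `build_desired_heading` is an assumed helper name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DesiredHeading:
    """Hypothetical, flattened stand-in for the DesiredHeading message."""

    heading: float = 0.0
    steering: int = 0
    sail: bool = True


def build_desired_heading(desired_heading: Optional[float]) -> DesiredHeading:
    """Build the message to publish; None means pathfinding failed this cycle."""
    msg = DesiredHeading()
    if desired_heading is None:
        # Explicit stop signal for downstream nodes; other fields keep defaults.
        msg.sail = False
    else:
        msg.heading = desired_heading
        msg.sail = True
    return msg
```

Calling `build_desired_heading(None)` yields a message with `sail=False`, while any numeric heading yields `sail=True` with that heading.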

Verification

  • Created temporary error handling test file (test_error_handling.py) locally and verified all tests pass
    • Tests verify exceptions are caught when get_path() fails
    • Tests confirm update_if_needed() returns (None, waypoint_index) on failure
    • Tests validate desired_heading_callback() sets sail=False when heading is None

Resources

jb3982 requested review from a team and raghumanimehta as code owners February 3, 2026 09:03
raghumanimehta left a comment

Good job! Need a couple of changes before I approve this!

You don't need to set any other field except sail in case of failure. Just set the said flag. The other teams handle the rest.

Contributor

Change the return type to Optional[ci.Path]. Handle this None return everywhere.

    return ci.Path(waypoints=waypoints)
except Exception as e:
    self._logger.error(f"Exception occurred while getting path from OMPL: {e}")
    return ci.Path()
Contributor

return None here. It makes little sense that empty path is bad.
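
A rough sketch of what the suggested Optional return could look like. The `Path` dataclass is a hypothetical stand-in for ci.Path, and the `solve` callable stands in for whatever OMPL call produces the waypoints; neither is taken from the actual code.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Waypoint = Tuple[float, float]


@dataclass
class Path:
    """Hypothetical stand-in for ci.Path."""

    waypoints: List[Waypoint] = field(default_factory=list)


def get_path(solve: Callable[[], List[Waypoint]], logger: logging.Logger) -> Optional[Path]:
    """Return the solved path, or None if the solver raises."""
    try:
        return Path(waypoints=solve())
    except Exception as e:
        logger.error(f"Exception occurred while getting path from OMPL: {e}")
        # None signals failure; an empty Path stays a valid (if trivial) result.
        return None
```

Callers then check `if path is None:` instead of inspecting the waypoint list.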

msg.sail = False
else:
msg.heading.heading = desired_heading
msg.steering = 0
Contributor

Why's the steering 0 here?

Contributor

Also, I don't think we change steering. Confirm this, please.

jb3982 commented Feb 4, 2026

In the function update_if_needed, the old-path case returns wp_index, but we calculate updated_wp_index for the old path. So I think it's a bug, or maybe I'm wrong. Can you please clarify this?
I've attached an image of it below:

[Screenshot 2026-02-03 at 11 07 52 PM]

@raghumanimehta

Yeah, it should be updated_wp_index. Thanks for catching that. Please make this change in the code.

raghumanimehta left a comment

There are other things that are missing still. Consider the scenarios:

  • What happens when the boat was set to sail == False? When would this be checked next? Have you tested this?
  • What happens if OMPL failed because of other reasons? I believe we could handle this by trying again immediately a couple of times after setting sail = False. If we can't get a path, wait for, let's say, 10 minutes before attempting to generate a new path.

What are your thoughts on these points? @jb3982 @SPDonaghy @FireBoyAJ24

    )
except Exception as e:
    self._logger.error(f"Failed to get new OMPL path: {e}")
    return None, local_waypoint_index
Contributor

Should perhaps return None for both.

    )
except Exception as e:
    self._logger.error(f"Failed to get old OMPL path: {e}")
    return None, local_waypoint_index
Contributor

Same as above comment.

Contributor

Also, you could perhaps abstract this into a function (DRY principle).
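
One possible shape for such a shared helper, as a sketch only. The name `try_get_path` and its `label` parameter are hypothetical; `get_path` is any zero-argument callable that wraps the real OMPL call.

```python
import logging
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def try_get_path(get_path: Callable[[], T], logger: logging.Logger, label: str) -> Optional[T]:
    """Call get_path(); log the failure and return None if it raises."""
    try:
        return get_path()
    except Exception as e:
        logger.error(f"Failed to get {label} OMPL path: {e}")
        return None
```

Both the new-path and old-path branches could then call this helper with `label="new"` or `label="old"` rather than repeating the same try/except block.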

self._ompl_path = ompl_path
self.path = self._ompl_path.get_path()
local_path = self._ompl_path.get_path()
if local_path is None:
Contributor

You should still update self.path. You have logged the error, but in the next cycle self.path would still reference a valid path, which is incorrect behaviour.

Contributor Author

self.path = local_path runs after the if block, since there is no return statement. So won't it set self.path to None, which would then trigger the if old_ompl_path is None or self.path is None: check, where it would call self._update(ompl_path)?

jb3982 commented Feb 12, 2026

> There are other things that are missing still. Consider the scenarios:
>
>   • What happens when the boat was set to sail == False? When would this be checked next? Have you tested this?
>   • What happens if OMPL failed because of other reasons? I believe we could handle this by trying again immediately a couple of times after setting sail = False. If we can't get a path, wait for, let's say, 10 minutes before attempting to generate a new path.
>
> What are your thoughts on these points? @jb3982 @SPDonaghy @FireBoyAJ24

  1. The current flow is like this:

    • desired_heading_callback runs and calls get_desired_heading, which then calls update_if_needed
    • If OMPL fails => desired_heading = None => sets sail = False
    • On the next tick, desired_heading_callback runs again => if OMPL succeeds in finding a path => publishes sail = True

     So the boat's sailing state is re-evaluated automatically every cycle.
  2. Just to confirm I understand correctly, you're proposing something like:

    1. On OMPL failure, immediately retry 2-3 times
    2. If all retries fail, set sail = False and enter a cooldown period (~10 min)
    3. After the cooldown, resume normal pathfinding attempts

I think this is a good idea. Could we use a shorter cooldown than 10 minutes, though? Short enough that our boat isn't drifting aimlessly for too long. What do you think is reasonable?

@raghumanimehta

Yes. No, just set it to 10 minutes for now. We can discuss this later.
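
For reference, the retry-then-cooldown idea discussed in this thread could be sketched along these lines. Everything here is an assumption for illustration: the class name `RetryWithCooldown`, counting `max_attempts` as total tries per call, and the injectable `clock` used to make the cooldown testable. The 600-second default mirrors the 10 minutes agreed above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetryWithCooldown:
    """Try a solver a few times, then refuse further attempts until a cooldown ends."""

    def __init__(
        self,
        max_attempts: int = 3,
        cooldown_s: float = 600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_attempts = max_attempts
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._cooldown_until: Optional[float] = None

    def in_cooldown(self) -> bool:
        return self._cooldown_until is not None and self._clock() < self._cooldown_until

    def attempt(self, solve: Callable[[], Optional[T]]) -> Optional[T]:
        """Return a result, or None if cooling down or every attempt failed."""
        if self.in_cooldown():
            return None
        for _ in range(self._max_attempts):
            try:
                result = solve()
            except Exception:
                result = None
            if result is not None:
                self._cooldown_until = None
                return result
        # All attempts failed: start the cooldown window.
        self._cooldown_until = self._clock() + self._cooldown_s
        return None
```

A None result from `attempt` would map to publishing sail = False for that cycle; once the cooldown elapses, the next call tries the solver again.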

Labels

path Pathfinding team

Development

Successfully merging this pull request may close these issues.

PATH: Stoping the boat

2 participants