Parallel configure/activate in lifecycle manager #5541
base: main
Conversation
Signed-off-by: Johannes Plapp <[email protected]>
Please fix some issues due to generative AI. There are variables that are defined but never used, and no error logging. Please review your code a bit more carefully before opening PRs with gen AI 😉
A design question: rather than having a large set of duplicated code for the parallel and non-parallel paths, we could change this to an if (!parallel_state_transitions_) {future.get();} type action in the loop where we create the futures. That way we can still run sequentially if we want, without much, if any, code duplication to support both features. Then, in the 'for each future' loop, we only do that if parallel_state_transitions_ is set.
Spinning up the threads might take a bit of extra time, but as long as it's not much, I think that design simplicity is worth an extra couple hundred milliseconds.
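For illustration, a minimal sketch of that single-loop design using std::async; changeStateForNode, node_names, and the function signature here are stand-ins for the lifecycle manager's actual members, not its verified API:

```cpp
#include <cstdint>
#include <future>
#include <string>
#include <vector>

// Placeholder for the lifecycle manager's per-node transition call.
bool changeStateForNode(const std::string & node_name, std::uint8_t transition);

bool changeStateForAllNodes(
  const std::vector<std::string> & node_names,
  std::uint8_t transition,
  bool parallel_state_transitions)
{
  std::vector<std::future<bool>> futures;
  for (const auto & node_name : node_names) {
    futures.push_back(
      std::async(
        std::launch::async,
        [transition, node_name]() {return changeStateForNode(node_name, transition);}));
    // Sequential mode: block on each future right away, matching the original
    // one-node-at-a-time behavior without a duplicated loop.
    if (!parallel_state_transitions) {
      if (!futures.back().get()) {
        return false;
      }
    }
  }
  // Parallel mode: the futures are still pending, so collect them here.
  if (parallel_state_transitions) {
    for (auto & future : futures) {
      if (!future.get()) {
        return false;
      }
    }
  }
  return true;
}
```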
More an FYI, but the ros2 launch API now has an autostart field I added, so you can autostart lifecycle nodes and components without a manager if you choose.
CI is failing, I believe due to another PR merged recently. Can you rebase / pull in main?
If that doesn't fix it, change all the v39 to v40 in this file https://github.com/ros-navigation/navigation2/blob/main/.circleci/config.yml#L41 (there are 3 of them).
declare_parameter("service_timeout", 5.0); | ||
declare_parameter("bond_respawn_max_duration", 10.0); | ||
declare_parameter("attempt_respawn_reconnection", true); | ||
declare_parameter("parallel_state_transitions", rclcpp::ParameterValue(true)); |
We would want this default off since Nav2 servers do indeed depend on each other.
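For reference, flipping the default would just mean changing the declaration (sketch only, mirroring the diff line above):

```cpp
// Sketch: default the new parameter to off so the existing sequential
// behavior is preserved unless a user opts in.
declare_parameter("parallel_state_transitions", rclcpp::ParameterValue(false));
```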
bond_respawn_max_duration_ = rclcpp::Duration::from_seconds(respawn_timeout_s);

get_parameter("attempt_respawn_reconnection", attempt_respawn_reconnection_);
get_parameter("parallel_state_transitions", parallel_state_transitions_);
docs.nav2.org needs to be updated with the configuration guide for the new parameter. Also a migration guide entry talking about this feature and some metrics would be nice so other users are aware.
if (!success && !hard_change) {
  uint8_t state = node_map_[node_name]->get_state();
  if (!strcmp(reinterpret_cast<char *>(&state), "Inactive")) {
We have the transition_state_map_ that we should probably use here, corresponding to the transition being completed.
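For example, a sketch of comparing against lifecycle state IDs instead of calling strcmp on a reinterpreted byte; the helper names here are hypothetical, and the exact members used in the lifecycle manager (e.g. transition_state_map_) may differ, but the lifecycle_msgs constants are the standard ROS 2 ones:

```cpp
#include <cstdint>
#include "lifecycle_msgs/msg/state.hpp"

// Hypothetical helpers: classify a node's reported primary state by its ID
// rather than reinterpreting the byte as a C string.
bool isInactive(std::uint8_t state)
{
  return state == lifecycle_msgs::msg::State::PRIMARY_STATE_INACTIVE;
}

bool isUnconfigured(std::uint8_t state)
{
  return state == lifecycle_msgs::msg::State::PRIMARY_STATE_UNCONFIGURED;
}
```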
if (!success && !hard_change) {
  uint8_t state = node_map_[node_name]->get_state();
  if (!strcmp(reinterpret_cast<char *>(&state), "Inactive")) {
    inactive_nodes += node_name + delimiter;
This is unused.
if (!strcmp(reinterpret_cast<char *>(&state), "Inactive")) {
  inactive_nodes += node_name + delimiter;
} else {
  unconfigured_nodes += node_name + delimiter;
This is unused.
return false;
/* Function partially created using claude */
size_t active_nodes_count = 0;
std::string nodes_in_error_state = "";
This is unused.
std::string nodes_in_error_state = "";
std::string unconfigured_nodes = "";
std::string inactive_nodes = "";
std::string delimiter(", ");
This shouldn't be necessary; we already have the information in the code needed to check the state returns.
Basic Info
Description of contribution in a few bullet points
We noticed that our lifecycle nodes don't depend on being configured/activated in sequence, so we can speed up robot launch by activating everything at once. For us, on the robot, it reduces launch time (until "all managed nodes are active") from 51 seconds to 35 seconds.
I tried with the simulation from this repo by running a system test:
colcon test --packages-select nav2_system_tests --event-handler=console_direct+ --ctest-args --output-on-failure -R _error_msg$
and after I removed some arbitrarily long sleeps from the tester node I got:
with this PR: 6 seconds for configure+activate; overall test 50 seconds
without this PR: 7 seconds for configure+activate; overall test 52 seconds (the overall difference is larger because deactivate is also faster)
So, not that much, but the real-world benefit, at least for us, is significant. Let me know if that is something you want to add, then we can polish this PR.
Description of documentation updates required from your changes
Description of how this change was tested
Tested on our robot and in the nav2 simulation.
It has been running on production robots for a couple of weeks, but as this only concerns launching, that doesn't mean much.
Future work that may be required in bullet points
For Maintainers:
backport-*