[CI] Time out from test_force_torque_sensor_broadcaster #1586
[CI] Time out from test_force_torque_sensor_broadcaster #1586christophfroehlich merged 15 commits intoros-controls:masterfrom
Conversation
saikishor
left a comment
There was a problem hiding this comment.
I've a feeling that it happens mostly or only in Humble. Right now the Rolling compatibility build of Humble is also failed in this PR
https://github.com/ros-controls/ros2_controllers/actions/runs/13800804076/job/38602688612#step:5:1
Setup a docker container with Humble, no repro ... |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1586 +/- ##
==========================================
- Coverage 85.10% 85.07% -0.03%
==========================================
Files 123 123
Lines 11699 11710 +11
Branches 995 996 +1
==========================================
+ Hits 9956 9962 +6
- Misses 1432 1435 +3
- Partials 311 313 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
@christophfroehlich, I only hit the failure once (as Sai pointed out above) and haven't been able to reproduce it in subsequent attempts. Is there anything else I should try to reproduce this issue? There are failures in the Coverage build and Rolling Binary Build. The test failure in the coverage build appears to be intermittent. The failures in the rolling binary build seem to be related to a dependency issue. Please let me know if these need further investigation. @saikishor, the controller loading tests reported warnings related to introspection. I'm unable to determine if this is expected behavior. |
Yes, this is normal. I'll reduce the spam, this needs to be handled at the pal_statistics level. I'll take care of it. I've retriggered the CI, let's see if we can reproduce it. |
|
We had some similar timeouts in the control_toolbox repo, fixed with ros-controls/control_toolbox#261 |
@christophfroehlich , thank you for pointing that out. It’s hard to tell whether this is similar to the one you mentioned above. To reduce the chance, I’ve updated the test teardown logic with more cleanup. I also implemented a timeout mechanism to prevent tests from hanging indefinitely. Could you please review these changes and let me know your thoughts? |
|
@Juliaj I believe we also have a default timeout of 60 secs for the gtest. At least that's what I thought, isn't it the case @christophfroehlich ? |
|
Exactly, 60s is the default, if not specified otherwise. Can we split this PR and see if the proper cleanup already helps? |
christophfroehlich
left a comment
There was a problem hiding this comment.
Let's give it a try
christophfroehlich
left a comment
There was a problem hiding this comment.
For reference: We already added an auto-shutdown in the destructor
https://github.com/ros-controls/ros2_control/blob/95188b92c9bf628e72dc90a671e1f6c00f0b7611/controller_interface/src/controller_interface_base.cpp#L26-L40
At assigning a new unique_ptr with fts_broadcaster_ = std::make_unique<FriendForceTorqueSensorBroadcaster>(); it should call this transition already.
(cherry picked from commit 996f030) # Conflicts: # force_torque_sensor_broadcaster/test/test_force_torque_sensor_broadcaster.cpp
|
Ah, I saw now that we haven't backported ros-controls/ros2_control#1979. This might be a reason why it fails more often with the humble version. |
This change attempts to debug and reproduce CI failures: #1574 and #1559.
In both issues, it appears that the timeout was caused by the test "InterfaceNames_Configure_Success", which primarily tests the
on_configurefunction. No hypothesis has been established so far.on_configurefunction inForceTorqueSensorBroadcasterand the changes introduced around the time of the failure. Other than a change made in realtime_tools regardingRTPublisher, no other changes were found that could impact on_configure.