Skip to content

[CI] Time out from test_force_torque_sensor_broadcaster #1586

Merged
christophfroehlich merged 15 commits intoros-controls:masterfrom
Juliaj:ci_issue_1574
Mar 26, 2025
Merged

[CI] Time out from test_force_torque_sensor_broadcaster #1586
christophfroehlich merged 15 commits intoros-controls:masterfrom
Juliaj:ci_issue_1574

Conversation

@Juliaj
Copy link
Copy Markdown
Contributor

@Juliaj Juliaj commented Mar 12, 2025

This change attempts to debug and reproduce CI failures: #1574 and #1559.

In both issues, it appears that the timeout was caused by the test "InterfaceNames_Configure_Success", which primarily tests the on_configure function. No hypothesis has been established so far.

  • The code in "InterfaceNames_Configure_Success" is straightforward.
  • I’ve reviewed the on_configure function in ForceTorqueSensorBroadcaster and the changes introduced around the time of the failure. Other than a change made in realtime_tools regarding RTPublisher, no other changes were found that could impact on_configure.

Copy link
Copy Markdown
Member

@saikishor saikishor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've a feeling that it happens mostly or only in Humble. Right now the Rolling compatibility build of Humble is also failed in this PR

https://github.com/ros-controls/ros2_controllers/actions/runs/13800804076/job/38602688612#step:5:1

@Juliaj
Copy link
Copy Markdown
Contributor Author

Juliaj commented Mar 12, 2025

I've a feeling that it happens mostly or only in Humble. Right now the Rolling compatibility build of Humble is also failed in this PR

Setup a docker container with Humble, no repro ...

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 12, 2025

Codecov Report

Attention: Patch coverage is 61.53846% with 5 lines in your changes missing coverage. Please review.

Project coverage is 85.07%. Comparing base (eca84fa) to head (601c73b).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
...ster/test/test_force_torque_sensor_broadcaster.cpp 61.53% 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1586      +/-   ##
==========================================
- Coverage   85.10%   85.07%   -0.03%     
==========================================
  Files         123      123              
  Lines       11699    11710      +11     
  Branches      995      996       +1     
==========================================
+ Hits         9956     9962       +6     
- Misses       1432     1435       +3     
- Partials      311      313       +2     
Flag Coverage Δ
unittests 85.07% <61.53%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...ster/test/test_force_torque_sensor_broadcaster.cpp 94.66% <61.53%> (-3.18%) ⬇️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Juliaj
Copy link
Copy Markdown
Contributor Author

Juliaj commented Mar 13, 2025

@christophfroehlich, I only hit the failure once (as Sai pointed out above) and haven't been able to reproduce it in subsequent attempts. Is there anything else I should try to reproduce this issue?

There are failures in the Coverage build and Rolling Binary Build. The test failure in the coverage build appears to be intermittent. The failures in the rolling binary build seem to be related to a dependency issue. Please let me know if these need further investigation.

@saikishor, the controller loading tests reported warnings related to introspection. I'm unable to determine if this is expected behavior.

 1: [ RUN      ] TestLoadAdmittanceController.load_controller
  1: [INFO] [1741819586.401847780] [test_controller_manager]: Using Steady (Monotonic) clock for triggering controller manager cycles.
  1: [INFO] [1741819586.433478203] [test_controller_manager]: Loading hardware 'TestActuatorHardware' 
[...]
  1: [WARN] [1741819586.435269427] [pal_statistics]: Unable to register entity state_interface.joint1/position in ros2_control, as the registry is not found. Try initializing it!
  1: [WARN] [1741819586.435294563] [pal_statistics]: Unable to register entity state_interface.joint1/velocity in ros2_control, as the registry is not found. Try initializing it!

@saikishor
Copy link
Copy Markdown
Member

saikishor commented Mar 13, 2025

@saikishor, the controller loading tests reported warnings related to introspection. I'm unable to determine if this is expected behavior.

 1: [ RUN      ] TestLoadAdmittanceController.load_controller
  1: [INFO] [1741819586.401847780] [test_controller_manager]: Using Steady (Monotonic) clock for triggering controller manager cycles.
  1: [INFO] [1741819586.433478203] [test_controller_manager]: Loading hardware 'TestActuatorHardware' 
[...]
  1: [WARN] [1741819586.435269427] [pal_statistics]: Unable to register entity state_interface.joint1/position in ros2_control, as the registry is not found. Try initializing it!
  1: [WARN] [1741819586.435294563] [pal_statistics]: Unable to register entity state_interface.joint1/velocity in ros2_control, as the registry is not found. Try initializing it!

Yes, this is normal. I'll reduce the spam, this needs to be handled at the pal_statistics level. I'll take care of it.

I've retriggered the CI, let's see if we can reproduce it.

@christophfroehlich
Copy link
Copy Markdown
Member

We had some similar timeouts in the control_toolbox repo, fixed with ros-controls/control_toolbox#261
I haven't checked yet, but does this apply here maybe too?

@Juliaj
Copy link
Copy Markdown
Contributor Author

Juliaj commented Mar 26, 2025

We had some similar timeouts in the control_toolbox repo, fixed with ros-controls/control_toolbox#261 I haven't checked yet, but does this apply here maybe too?

@christophfroehlich , thank you for pointing that out. It’s hard to tell whether this is similar to the one you mentioned above. To reduce the chance, I’ve updated the test teardown logic with more cleanup. I also implemented a timeout mechanism to prevent tests from hanging indefinitely. Could you please review these changes and let me know your thoughts?

@saikishor
Copy link
Copy Markdown
Member

@Juliaj I believe we also have a default timeout of 60 secs for the gtest. At least that's what I thought, isn't it the case @christophfroehlich ?

@christophfroehlich
Copy link
Copy Markdown
Member

Exactly, 60s is the default, if not specified otherwise. Can we split this PR and see if the proper cleanup already helps?

Copy link
Copy Markdown
Member

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's give it a try

Copy link
Copy Markdown
Member

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference: We already added an auto-shutdown in the destructor
https://github.com/ros-controls/ros2_control/blob/95188b92c9bf628e72dc90a671e1f6c00f0b7611/controller_interface/src/controller_interface_base.cpp#L26-L40

At assigning a new unique_ptr with fts_broadcaster_ = std::make_unique<FriendForceTorqueSensorBroadcaster>(); it should call this transition already.

@christophfroehlich christophfroehlich merged commit 996f030 into ros-controls:master Mar 26, 2025
25 checks passed
@Juliaj Juliaj deleted the ci_issue_1574 branch March 26, 2025 20:53
@christophfroehlich christophfroehlich added the backport-humble Triggers PR backport to ROS 2 humble. label Mar 26, 2025
mergify bot pushed a commit that referenced this pull request Mar 26, 2025
(cherry picked from commit 996f030)

# Conflicts:
#	force_torque_sensor_broadcaster/test/test_force_torque_sensor_broadcaster.cpp
@christophfroehlich
Copy link
Copy Markdown
Member

Ah, I saw now that we haven't backported ros-controls/ros2_control#1979. This might be a reason why it fails more often with the humble version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-humble Triggers PR backport to ROS 2 humble.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants