
Service Response Timeout - Race Condition #842

@hilary-luo

Description


Generated by Generative AI

No response

Operating System:

Ubuntu 22.04 and 24.04 (Server and Desktop), and likely others

ROS version or commit hash:

Humble & Jazzy & probably Rolling

RMW implementation (if applicable):

rmw_fastrtps_cpp

RMW Configuration (if applicable):

No response

Client library (if applicable):

No response

'ros2 doctor --report' output

Not provided; the issue occurs broadly across many different systems.

Steps to reproduce issue

Set up a system that launches a large number of nodes at the same time, including lifecycle nodes (for example, ros2_control and Nav2 nodes). The issue is exacerbated by simple discovery, or by a discovery server that is only reachable over a wireless network.

These are well-provisioned, reasonable systems (a robot platform running ros2_control to drive the base, plus Nav2, a lidar, and a camera or a manipulator) with adequate CPU and RAM. A sketch of the client-side interaction that fails is shown below.
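To make the failing interaction concrete, here is a minimal rclpy sketch of what a lifecycle manager effectively does at bringup: discover a lifecycle node and immediately request a transition. The node name /my_lifecycle_node and the timeout value are placeholders rather than anything from the actual systems; when many such requests are issued while discovery is still converging, the final branch is the symptom reported here.

```python
# Hypothetical minimal client mirroring what a lifecycle manager does at bringup:
# discover a lifecycle node and immediately request a transition.
# '/my_lifecycle_node' and the 10 s timeout are placeholders, not from the real setup.
import rclpy
from lifecycle_msgs.msg import Transition
from lifecycle_msgs.srv import ChangeState


def main():
    rclpy.init()
    node = rclpy.create_node('race_repro_client')
    client = node.create_client(ChangeState, '/my_lifecycle_node/change_state')

    # wait_for_service() reflects this client's local view of discovery only;
    # the server's reply writer may not yet have matched this client's reply
    # reader, which is exactly the race described in this report.
    client.wait_for_service()

    request = ChangeState.Request()
    request.transition.id = Transition.TRANSITION_CONFIGURE
    future = client.call_async(request)
    rclpy.spin_until_future_complete(node, future, timeout_sec=10.0)

    if future.done():
        node.get_logger().info(f'transition accepted: {future.result().success}')
    else:
        # With many nodes discovering each other at once, this branch is hit even
        # though the server side often reports that the transition completed.
        node.get_logger().error('no response received; the reply was likely lost')

    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```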

Expected behavior

Although discovery may take some time, all nodes should launch properly and lifecycle nodes should transition correctly between states; in particular, no service response should ever be missed.

Actual behavior

When the nodes all start up, some of them fail to transition because of timeout errors: "failed to send response (timeout): client will not receive response". Extending the timeout does not make these failures go away. In some instances the node was confirmed to have transitioned correctly even though the response was never received. Looking at the code, this is likely the known race condition referenced by the TODO comment found at:

// TODO(MiguelCompany) The following block is a workaround for the race on the
and described in the associated DDS spec.

To quote the DDS spec that is referenced:

Service discovery for the Basic Service Mapping is not robust because discovery race conditions can cause the service replies to be lost. The request-topic and reply-topic are two different RTPS sessions that are matched independently by the DDS discovery process. For this reason it is possible for the entities on the request topic to discover each other before the entities on the reply topic discover each other. In such a situation, if a client makes a request before the entities over the reply topic are fully discovered, the client may lose the corresponding replies.

The TODO notes that this should be re-implemented using the Enhanced Service Mapping.
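For context, the usual way this gets worked around on the user side today is to retry the transition and to re-query the node's state when a reply goes missing, since the transition often did succeed on the server. The sketch below is illustrative only: the helper name, attempt count, and timeouts are made up, and the GetState reply can be lost to the very same race, so it merely papers over the problem rather than fixing it.

```python
# Hedged sketch of a retry-style workaround; not part of any released API.
import rclpy
from lifecycle_msgs.msg import Transition
from lifecycle_msgs.srv import ChangeState, GetState


def change_state_with_retry(node, node_name, transition_id, attempts=3, timeout=10.0):
    change_cli = node.create_client(ChangeState, f'{node_name}/change_state')
    get_cli = node.create_client(GetState, f'{node_name}/get_state')
    change_cli.wait_for_service()
    get_cli.wait_for_service()

    for attempt in range(attempts):
        req = ChangeState.Request()
        req.transition.id = transition_id
        future = change_cli.call_async(req)
        rclpy.spin_until_future_complete(node, future, timeout_sec=timeout)
        if future.done() and future.result().success:
            return True

        # The reply may have been lost even though the transition happened,
        # so query the current state before retrying.
        state_future = get_cli.call_async(GetState.Request())
        rclpy.spin_until_future_complete(node, state_future, timeout_sec=timeout)
        if state_future.done():
            node.get_logger().warn(
                f'attempt {attempt + 1}: current state is '
                f'{state_future.result().current_state.label}')
    return False


# Example use (node name is a placeholder):
#   change_state_with_retry(node, '/my_lifecycle_node', Transition.TRANSITION_CONFIGURE)
```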

Additional information

Is there a reason that the Enhanced Service Mapping has not been implemented, or is it just a matter of time or of someone contributing it? In my eyes this is a major issue, so I would like to engage in a conversation about what it would take to get this fixed.

I have personally seen this issue affect Clearpath Robotics robots, TurtleBot 4s, and Nav2 users in general across Humble and Jazzy. I have seen a number of tickets across these repos about symptoms that are likely caused by this root issue, so I do believe it is affecting a lot of people.

Labels

bug (Something isn't working)