Description
Describe the bug
Context: forked from the discussion on Discord https://discord.com/channels/914168414178779197/940584045287460885/1436370380628426936.
The CongestionControl policy Drop drops large messages even when there is no network congestion. Sending a single 5MB message over a WiFi connection (200 Mbit/s up/down) always results in the message being dropped.
There are some configuration knobs that can be tweaked to increase the internal timeouts (wait_before_drop, max_wait_before_drop_fragments), but this does not fundamentally solve the issue for an arbitrarily sized message.
Observations:
- the Drop policy should not drop a large message midway just because it is large and some internal timeout expires. It should try to deliver it and only drop newly arriving messages.
- the Drop policy blocks the calling thread, albeit with a timeout (max_wait_before_drop_fragments). This is very misleading. Why doesn't the Drop policy just store the message in an internal queue for async dispatching and release the calling thread?
- the default value for max_wait_before_drop_fragments is 50 milliseconds. For successful delivery of a 5MB message within that window, the network bandwidth would need to be at least 800 Mbit/s. This seems excessive. Is this the best default value?
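
For reference, the bandwidth figure in the last observation can be reproduced with a short back-of-the-envelope calculation. This is only a sketch: the helper names `required_mbit_per_s` and `required_timeout_ms` are made up for illustration and are not part of Zenoh, 1 MB is assumed to be 10^6 bytes, and protocol overhead, fragmentation cost and WiFi retransmissions are all ignored.

```cpp
#include <cstdint>

// Hypothetical helper (not a Zenoh API): minimum link throughput, in Mbit/s,
// needed to push `payload_bytes` onto the wire within `timeout_ms`.
// Assumes an ideal link with no protocol overhead.
inline double required_mbit_per_s(std::uint64_t payload_bytes, double timeout_ms) {
    const double bits = static_cast<double>(payload_bytes) * 8.0;
    const double seconds = timeout_ms / 1000.0;
    return bits / seconds / 1e6;
}

// Hypothetical helper (not a Zenoh API): minimum time, in milliseconds,
// needed to send `payload_bytes` over a link running at `link_mbit_per_s`.
inline double required_timeout_ms(std::uint64_t payload_bytes, double link_mbit_per_s) {
    const double bits = static_cast<double>(payload_bytes) * 8.0;
    return bits / (link_mbit_per_s * 1e6) * 1000.0;
}
```

Under these assumptions, `required_mbit_per_s(5000000, 50.0)` gives about 800 Mbit/s, while `required_timeout_ms(5000000, 200.0)` gives about 200 ms. In other words, on the 200 Mbit/s link described above, a 5MB message would need a window roughly four times larger than the 50 ms default, and a 10MB message roughly eight times larger.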
Taking a step back, the overall feedback is that we should aim for Zenoh to work out of the box for standard cases. The configuration knobs should be reserved solely for optimizing a traffic pattern.
To reproduce
Start two nodes in peer mode, connected via WiFi. Send a message of 5-10MB. The subscriber never receives any message.
System info
Zenoh Cpp 1.5.0
Ubuntu 24.04