Commit 032e3cb
Cluster: Avoid usage of light weight messages to nodes with not ready bidirectional links (#2817)
After a network failure, nodes that come back to the cluster do not always
send and/or receive messages from other nodes in the shard. This fix avoids
using light weight messages to nodes whose bidirectional links are not ready.
When a light message arrives before any normal message, the cluster link
is freed because link->node is not assigned yet on the just established
connection. It is assigned in getNodeFromLinkAndMsg right after the
if (is_light) condition.
So on a cluster with heavy pubsub load a long loop of disconnects is
possible, and we ran into exactly this:
1. node A establishes cluster link to node B
2. node A propagates PUBLISH to node B
3. node B frees the cluster link because link->node == NULL, as it has not
received any non-light message on it yet
4. go to 1.
During this loop, subscribers of node B do not receive any messages
published to node A.
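
The loop above can be reproduced with a small model. This is only an illustrative sketch, not the actual cluster code: `ClusterNode`, `ClusterLink`, `receive` and `simulate_publishes` are hypothetical stand-ins, and the only behavior taken from the description is that a light message on a link with link->node == NULL causes the link to be freed, while a normal message assigns link->node.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real cluster structures. */
typedef struct ClusterNode { int id; } ClusterNode;
typedef struct ClusterLink { ClusterNode *node; } ClusterLink;

/* Models node B's receive path.
 * Returns false when B would free the link (light message, unknown sender). */
static bool receive(ClusterLink *link, ClusterNode *sender, bool is_light) {
    if (is_light) return link->node != NULL;
    link->node = sender; /* a normal message identifies the peer */
    return true;
}

/* Node A tries to deliver `attempts` PUBLISH messages as light messages.
 * If ping_first is true, A sends a normal message on every fresh link first.
 * Returns how many PUBLISH messages node B accepted. */
int simulate_publishes(int attempts, bool ping_first) {
    ClusterNode a = {1};
    ClusterLink link = {NULL};
    int delivered = 0;
    for (int i = 0; i < attempts; i++) {
        if (ping_first && link.node == NULL) receive(&link, &a, false);
        if (receive(&link, &a, true)) {
            delivered++;
        } else {
            link.node = NULL; /* B freed the link; A re-establishes it (step 4) */
        }
    }
    return delivered;
}
```

In this model, `simulate_publishes(n, false)` never delivers anything, because every fresh link starts with a light message and is freed again, whereas `simulate_publishes(n, true)` delivers all n messages.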
So here we want to make sure that PING was sent (and link->node was
initialized) on this connection before using lightweight messages.
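
A sender-side guard of this kind could look roughly like the sketch below. The diff itself is not reproduced here, so this is an assumption about the shape of the check, not the code from the commit: `OutLink`, `peer_supports_light`, `ping_sent`, `mark_ping_sent` and `choose_publish_kind` are all hypothetical names.

```c
#include <stdbool.h>

typedef enum { MSG_NORMAL, MSG_LIGHT } MsgKind;

/* Hypothetical per-link state kept by the sending node. */
typedef struct OutLink {
    bool peer_supports_light; /* peer advertised light message support */
    bool ping_sent;           /* a PING already went out on this connection */
} OutLink;

/* Record that a PING (a normal message) was sent on this link. */
void mark_ping_sent(OutLink *link) { link->ping_sent = true; }

/* Use a light message only once a PING has been sent on the link, so the
 * receiver has had a normal message from which to initialize link->node.
 * Otherwise fall back to a normal message. */
MsgKind choose_publish_kind(const OutLink *link) {
    if (link->peer_supports_light && link->ping_sent) return MSG_LIGHT;
    return MSG_NORMAL;
}
```

The fallback to a normal message, rather than dropping the PUBLISH, is a design choice of this sketch; the commit message only states that lightweight messages are avoided until the link is ready.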
---------
Signed-off-by: Daniil Kashapov <[email protected]>
Co-authored-by: Harkrishn Patro <[email protected]>

1 parent bf03b0c commit 032e3cb
2 files changed: +12 -2 lines changed
File 1: 12 lines added (new lines 4831-4842).
File 2: 2 lines removed (original lines 78-79).