Commit df5c79a
Cluster: Avoid usage of light weight messages to nodes with not ready bidirectional links (#2817)
After network failure nodes that come back to cluster do not always send
and/or receive messages from other nodes in shard, this fix avoids usage
of light weight messages to nodes with not ready bidirectional links.
When a light message comes before any normal message, freeing of cluster
link is happening because on the just established connection link->node
is not assigned yet. It is assigned in getNodeFromLinkAndMsg right after
the condition if (is_light).
So on a cluster with heavy pubsub load a long loop of disconnects is
possible, and we got this.
1. node A establishes cluster link to node B
2. node A propagates PUBLISH to node B
3. node B frees cluster link because of link->node == null as it has not
received non-light messages yet
4. go to 1.
During this loop subscribers of node B does not receive any messages
published to node A.
So here we want to make sure that PING was sent (and link->node was
initialized) on this connection before using lightweight messages.
---------
Signed-off-by: Daniil Kashapov <[email protected]>
Co-authored-by: Harkrishn Patro <[email protected]>1 parent 8753cb3 commit df5c79a
2 files changed
+13
-2
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
| 47 | + | |
47 | 48 | | |
48 | 49 | | |
49 | 50 | | |
| |||
4400 | 4401 | | |
4401 | 4402 | | |
4402 | 4403 | | |
| 4404 | + | |
| 4405 | + | |
| 4406 | + | |
| 4407 | + | |
| 4408 | + | |
| 4409 | + | |
| 4410 | + | |
| 4411 | + | |
| 4412 | + | |
| 4413 | + | |
| 4414 | + | |
| 4415 | + | |
4403 | 4416 | | |
4404 | 4417 | | |
4405 | 4418 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
67 | 67 | | |
68 | 68 | | |
69 | 69 | | |
70 | | - | |
71 | | - | |
72 | 70 | | |
73 | 71 | | |
74 | 72 | | |
| |||
0 commit comments