Commit c29d6ff
mon/Elector.cc: prevent assertion failure when receiving pings from removed monitors
When a monitor is removed from the cluster, there can be a race condition
where the removed monitor is still running and sending ping messages to
other monitors, while those monitors have already updated their monmap
and no longer recognize the removed monitor's address.
This causes MonMap::get_rank() to return -1 for the removed monitor's
address, which then gets passed to MonMap::get_addrs(unsigned), causing
an assertion failure since -1 cast to unsigned becomes UINT_MAX.
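The conversion described above can be reproduced in isolation. The following is a minimal sketch, not the Ceph code: `as_unsigned_rank` and `out_of_range` are hypothetical helpers that only mimic what happens when a signed rank of -1 reaches a parameter declared as `unsigned`.

```cpp
#include <climits>

// Hypothetical helper: converts a signed rank to the unsigned type that an
// index-taking function would receive through an implicit conversion.
unsigned as_unsigned_rank(int rank) {
  return static_cast<unsigned>(rank);
}

// True when the converted rank can never be a valid index into a map that
// holds `map_size` entries, which is the condition a bounds assertion trips on.
bool out_of_range(int rank, unsigned map_size) {
  return as_unsigned_rank(rank) >= map_size;
}
```

With these helpers, `as_unsigned_rank(-1)` equals `UINT_MAX`, so `out_of_range(-1, n)` holds for every map size `n`, which is why the sign has to be checked before the conversion takes place.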
Add defensive checks in three places to handle this scenario:
1. In begin_peer_ping(): return early if peer < 0
2. In send_peer_ping(): check both peer < 0 and peer >= ranks.size()
3. In handle_ping(): drop messages from unknown senders (rank < 0)
This prevents the assertion failure and provides better logging for
diagnosing such race conditions.
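The pattern behind the three checks listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual change to mon/Elector.cc: `ToyMonMap`, `is_valid_peer`, and `handle_ping_from` are hypothetical names, and the real code works with MonMap and Elector state rather than a plain vector of strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a monitor map: the index into `addrs` is the rank.
struct ToyMonMap {
  std::vector<std::string> addrs;

  // Returns the rank for an address, or -1 if the address is unknown.
  int get_rank(const std::string& addr) const {
    for (std::size_t i = 0; i < addrs.size(); ++i) {
      if (addrs[i] == addr) return static_cast<int>(i);
    }
    return -1;
  }
};

// Returns true only if `peer` is a usable index into the map. The sign is
// tested first so that -1 is never converted to a huge unsigned value.
bool is_valid_peer(const ToyMonMap& map, int peer) {
  return peer >= 0 && static_cast<std::size_t>(peer) < map.addrs.size();
}

// Sketch of a ping handler: returns false (message dropped) when the sender
// has no rank, for example a monitor that was already removed from the map.
bool handle_ping_from(const ToyMonMap& map, const std::string& sender) {
  int rank = map.get_rank(sender);
  if (!is_valid_peer(map, rank)) {
    return false;  // unknown sender: drop instead of indexing by rank
  }
  return true;  // rank is in range, safe to index by it from here on
}
```

For a map holding `mon.a`, `mon.b`, and `mon.c`, a ping from `mon.b` is accepted while a ping from a removed `mon.d` is dropped before any lookup by rank occurs.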
Fixes: https://tracker.ceph.com/issues/71259
Signed-off-by: chungfengz <[email protected]>
1 parent b69aef5 commit c29d6ff
1 file changed, +10 -1 lines changed

| Hunk | Original file lines | Diff lines | Change |
|---|---|---|---|
| 1 | 453-458 | 453-462 | 4 lines added (diff lines 456-459) |
| 2 | 492-498 | 496-502 | 1 line replaced (original line 495, diff line 499) |
| 3 | 609-614 | 613-623 | 5 lines added (diff lines 616-620) |