Commit 8c19227

osd/PeeringState: handle race condition of DeferBackfill event for Backfilling state
Currently, when a PG in the `Backfilling` state receives a `DeferBackfill` event, that event can race with `MOSDPGBackfill::OP_BACKFILL_FINISH` because the PG has already finished backfilling. In that case, the following happens:

1. PG state is set to `PG_STATE_BACKFILL_WAIT`
2. Backfilling is suspended
3. The event is discarded

Note that backfill is not rescheduled in the above steps, which can leave the PG stuck in the `backfill_wait` state forever.

This bug was introduced by the following commit:

`865839f`: osd/PeeringState: check racing with OP_BACKFILL_FINISH when defering backfill
PR Link: ceph#60185

This commit fixes that by making sure that, in race conditions such as the one above, we only discard the event.

Fixes: https://tracker.ceph.com/issues/71010
Signed-off-by: Naveen Naidu <[email protected]>
(cherry picked from commit b2bd15b)
1 parent da27661 commit 8c19227

1 file changed, +8 -6 lines changed

src/osd/PeeringState.cc

Lines changed: 8 additions & 6 deletions
@@ -5411,13 +5411,12 @@ boost::statechart::result
 PeeringState::Backfilling::react(const DeferBackfill &c)
 {
   DECLARE_LOCALS;
-
-  psdout(10) << "defer backfill, retry delay " << c.delay << dendl;
-  ps->state_set(PG_STATE_BACKFILL_WAIT);
-  ps->state_clear(PG_STATE_BACKFILLING);
-  suspend_backfill();
-
   if (ps->needs_backfill()) {
+    psdout(10) << "defer backfill, retry delay " << c.delay << dendl;
+    ps->state_set(PG_STATE_BACKFILL_WAIT);
+    ps->state_clear(PG_STATE_BACKFILLING);
+    suspend_backfill();
+
     pl->schedule_event_after(
       std::make_shared<PGPeeringEvent>(
         ps->get_osdmap_epoch(),
@@ -5427,6 +5426,9 @@ PeeringState::Backfilling::react(const DeferBackfill &c)
     return transit<NotBackfilling>();
   } else {
     // raced with MOSDPGBackfill::OP_BACKFILL_FINISH, ignore
+    psdout(10) << "discarding stale DeferBackfill event , pg does not need "
+                  "backfill anymore"
+               << dendl;
     return discard_event();
   }
 }
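
The behavioural difference can be illustrated with a small standalone model. This is only a sketch and not Ceph code: `MockPG`, `react_before_fix`, `react_after_fix`, and `is_stuck` are hypothetical names, and the boolean fields merely stand in for `ps->needs_backfill()`, the `PG_STATE_BACKFILLING` and `PG_STATE_BACKFILL_WAIT` flags, and the retry scheduled through `pl->schedule_event_after(...)`.

```cpp
#include <cassert>

// Hypothetical stand-in for the pieces of PG state the handler touches.
struct MockPG {
  bool needs_backfill = false;   // false once OP_BACKFILL_FINISH has been handled
  bool backfilling = true;       // models PG_STATE_BACKFILLING
  bool backfill_wait = false;    // models PG_STATE_BACKFILL_WAIT
  bool retry_scheduled = false;  // models the retry event from schedule_event_after()
};

// Models the handler before this commit: the state flags are changed
// unconditionally, but a retry is scheduled only if backfill is still needed.
void react_before_fix(MockPG &pg) {
  pg.backfill_wait = true;
  pg.backfilling = false;
  if (pg.needs_backfill) {
    pg.retry_scheduled = true;
  }
  // else: the event is discarded, yet backfill_wait is already set
}

// Models the handler after this commit: the state flags are changed only
// on the path that also schedules the retry.
void react_after_fix(MockPG &pg) {
  if (pg.needs_backfill) {
    pg.backfill_wait = true;
    pg.backfilling = false;
    pg.retry_scheduled = true;
  }
  // else: stale event, discard without touching any state
}

// A PG is "stuck" in this model if it waits for a backfill retry
// that was never scheduled.
bool is_stuck(const MockPG &pg) {
  return pg.backfill_wait && !pg.retry_scheduled;
}
```

In this model, calling `react_before_fix` on a default `MockPG` (where `needs_backfill` is already false, as in the race) leaves it stuck, while `react_after_fix` leaves the same PG untouched.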