Commit 0ba34b1

broker: provision dead brokers for flub replacement
Problem: there is no way to replace a node in a Flux instance that goes down.

Call overlay_flub_provision () when a rank goes offline so that the flub allocator can allocate its rank to a replacement. Unprovision ranks when they come back online.
1 parent 1a59ee7 commit 0ba34b1

1 file changed

+18
-0
lines changed

src/broker/state_machine.c

Lines changed: 18 additions & 0 deletions
@@ -925,6 +925,24 @@ static void broker_online_cb (flux_future_t *f, void *arg)
         }
         idset_destroy (loss);
     }
+    /* A broker that drops out of s->quorum.online is provisioned
+     * for replacement via flub, and unprovisioned if it returns.
+     */
+    if (previous_online) {
+        unsigned int id;
+        id = idset_first (previous_online);
+        while (id != IDSET_INVALID_ID) { // online -> offline
+            if (!idset_test (s->quorum.online, id))
+                (void)overlay_flub_provision (s->ctx->overlay, id, id, true);
+            id = idset_next (previous_online, id);
+        }
+        id = idset_first (s->quorum.online);
+        while (id != IDSET_INVALID_ID) { // offline -> online
+            if (!idset_test (previous_online, id))
+                (void)overlay_flub_provision (s->ctx->overlay, id, id, false);
+            id = idset_next (s->quorum.online, id);
+        }
+    }
 
     idset_destroy (previous_online);
     flux_future_reset (f);
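
The hunk above compares two membership snapshots and acts only on the ranks that changed. A minimal standalone sketch of that pattern follows; it is an illustration, not the broker's code. The names rankset_t, struct flub_table, rankset_test (), flub_provision () and update_provisioning () are hypothetical stand-ins, a 32-bit mask replaces struct idset, and the sketch does not reproduce the real signature or behavior of overlay_flub_provision ().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct idset: bit i set means rank i is online. */
typedef uint32_t rankset_t;

/* Hypothetical stand-in for the flub provisioning state: bit i set means
 * rank i is available to a replacement broker.
 */
struct flub_table {
    rankset_t provisioned;
};

static bool rankset_test (rankset_t set, unsigned int id)
{
    assert (id < 32);
    return (set >> id) & 1u;
}

/* Mark one rank as available (true) or not available (false). */
static void flub_provision (struct flub_table *t,
                            unsigned int id,
                            bool available)
{
    assert (id < 32);
    if (available)
        t->provisioned |= (rankset_t)1 << id;
    else
        t->provisioned &= ~((rankset_t)1 << id);
}

/* Apply the transitions between the previous and current online sets.
 * Ranks present in both sets, or absent from both, are left untouched.
 */
static void update_provisioning (struct flub_table *t,
                                 rankset_t previous_online,
                                 rankset_t online)
{
    for (unsigned int id = 0; id < 32; id++) {
        bool was_online = rankset_test (previous_online, id);
        bool is_online = rankset_test (online, id);

        if (was_online && !is_online)       // online -> offline
            flub_provision (t, id, true);
        else if (!was_online && is_online)  // offline -> online
            flub_provision (t, id, false);
    }
}
```

Under these assumptions, calling update_provisioning () with previous_online = 0x0F and online = 0x0B marks only rank 2 as provisioned, and a later call with the two arguments swapped clears it again. The committed code gets the same effect with two passes over idsets, one per direction, instead of a single loop over a fixed-width mask.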
