Commit 0685765

Don't strategize executors in bad state (#3994)
This is consistent with JobStatusPoller.close, which does not scale in executors in bad state. See issue #3992 for more context.

This fixes a race condition that I can create by adding fuzz delays (as described in #3992) and that appeared (without adding fuzz timing) in PR #3991, probably due to reduced time taken per strategy iteration in that PR.

The new use of `bad_state_is_set` is prone to race conditions - that's part of the model of how `bad_state_is_set` works everywhere in the codebase, and is briefly discussed in PR #3995.

# Changed Behaviour

Scaling will not happen for bad-state executors. I think this is always the right thing to do.

## Type of change

- Bug fix
1 parent c14b729 commit 0685765

2 files changed: +6 -0 lines changed

parsl/jobs/strategy.py

Lines changed: 5 additions & 0 deletions
@@ -185,6 +185,11 @@ def _general_strategy(self, executors: List[BlockProviderExecutor], *, strategy_
 
         for executor in executors:
             label = executor.label
+
+            if executor.bad_state_is_set:
+                logger.info(f"Not strategizing for executor {label} because bad state is set")
+                continue
+
             logger.debug(f"Strategizing for executor {label}")
 
             if self.executors[label]['first']:
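
The effect of the added guard can be sketched in isolation. This is only an illustration, not Parsl's code: `FakeExecutor` and `strategize` are hypothetical names made up here, and the real `_general_strategy` makes scaling decisions where this sketch merely records a label.

```python
import logging
from dataclasses import dataclass
from typing import List

logger = logging.getLogger(__name__)


@dataclass
class FakeExecutor:
    """Hypothetical stand-in for a BlockProviderExecutor; not Parsl's class."""
    label: str
    bad_state_is_set: bool = False


def strategize(executors: List[FakeExecutor]) -> List[str]:
    """Return the labels of the executors that would be considered for scaling."""
    considered = []
    for executor in executors:
        label = executor.label

        # Same guard as in the diff: skip executors flagged as being in bad state.
        if executor.bad_state_is_set:
            logger.info(f"Not strategizing for executor {label} because bad state is set")
            continue

        logger.debug(f"Strategizing for executor {label}")
        considered.append(label)  # the real code would decide on scaling here
    return considered


healthy = FakeExecutor("htex_ok")
broken = FakeExecutor("htex_bad", bad_state_is_set=True)
result = strategize([healthy, broken])
```

Here `result` contains only the label of the healthy executor. The flag is read once per executor per iteration, so, as the commit message notes, it can still change between the check and any later scaling action.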

parsl/tests/test_scaling/test_regression_3696_oscillation.py

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ def test_htex_strategy_does_not_oscillate(ns):
     executor.outstanding = lambda: n_tasks
     executor.status_facade = statuses
     executor.workers_per_node = n_workers
+    executor.bad_state_is_set = False
 
     provider.parallelism = 1
     provider.init_blocks = 0
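
One plausible reason the test needs this line, assuming its `executor` is a `unittest.mock` object (the hunk does not show how it is constructed): an attribute that was never assigned on a mock is itself a mock, and mocks are truthy, so the new guard would skip the executor. A minimal sketch of that behaviour:

```python
from unittest import mock

# Assumption: a plain Mock stands in for the test's executor object.
executor = mock.Mock()

# An attribute that was never assigned is auto-created as a Mock, which is truthy.
unset_is_truthy = bool(executor.bad_state_is_set)

# Assigning the attribute explicitly, as the test change does, yields a real bool.
executor.bad_state_is_set = False
explicit_is_truthy = bool(executor.bad_state_is_set)
```

With the explicit assignment, `if executor.bad_state_is_set:` evaluates as false and the strategy loop proceeds to the executor as before.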
