Commit 1e4b9ca

Simplify the approach in on-error_kill.cwl. The workflow now activates the kill switch via a ToolTimeLimit requirement. It also uses a much longer timeout, which will hopefully be sufficient for the CI server when it is congested.
1 parent be59356 commit 1e4b9ca

2 files changed: 47 additions and 49 deletions

tests/test_parallel.py

Lines changed: 10 additions & 5 deletions
@@ -56,13 +56,18 @@ def selectResources(
     ks_test = factory.make(get_data(test_file))
 
     # arbitrary test values
-    sleep_time = 33  # a "sufficiently large" timeout
-    n_sleepers = 5
+    sleep_time = 3333  # a "sufficiently large" timeout
+    n_sleepers = 4
+    start_time = 0
 
     try:
         start_time = time.time()
-        ks_test(sleep_time=sleep_time)
+        ks_test(
+            sleep_time=sleep_time,
+            n_sleepers=n_sleepers,
+        )
     except WorkflowStatus as e:
         end_time = time.time()
-        assert e.out == {"instructed_sleep_times": [sleep_time] * n_sleepers}
-        assert end_time - start_time < (sleep_time + 4)
+        output = e.out["roulette_mask"]
+        assert len(output) == n_sleepers and sum(output) == 1
+        assert end_time - start_time < sleep_time
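
The two new assertions can be read as independent checks: the collected output must be a mask of `n_sleepers` booleans with exactly one true entry, and the run must end before a full `sleep_time` could have elapsed. A minimal sketch of that logic follows. The helper names `is_valid_roulette_mask` and `finished_early` are hypothetical and are not part of the test suite.

```python
from typing import Sequence


def is_valid_roulette_mask(mask: Sequence[bool], n_sleepers: int) -> bool:
    """Return True if mask has n_sleepers entries and exactly one is true."""
    # sum() counts True as 1 and False as 0, so it yields the number of true entries.
    return len(mask) == n_sleepers and sum(mask) == 1


def finished_early(start_time: float, end_time: float, sleep_time: float) -> bool:
    """Return True if the elapsed time is shorter than one full sleep."""
    return end_time - start_time < sleep_time
```

With the values from the diff, `is_valid_roulette_mask([False, True, False, False], 4)` holds, while a mask with two true entries or the wrong length does not.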

tests/wf/on-error_kill.cwl

Lines changed: 37 additions & 44 deletions
@@ -17,77 +17,70 @@ doc: |
 
 
 inputs:
-  sleep_time: { type: int, default: 33 }
-  n_sleepers: { type: int?, default: 5 }
+  sleep_time: {type: int, default: 3333}
+  n_sleepers: {type: int, default: 4}
 
 
 steps:
-  make_array:
+  roulette:
     doc: |
-      This step produces an array of sleep_time values to be used
-      as inputs for the scatter_step. The array also serves as the
-      workflow output which should be collected despite the
-      kill switch triggered in the kill step below.
-    in: { sleep_time: sleep_time, n_sleepers: n_sleepers }
-    out: [ times ]
+      This step produces a boolean array with exactly one true value
+      whose index is assigned at random.
+    in: {n_sleepers: n_sleepers}
+    out: [mask]
     run:
       class: ExpressionTool
-      inputs:
-        sleep_time: { type: int }
-        n_sleepers: { type: int }
-      outputs: { times: { type: "int[]" } }
+      inputs: {n_sleepers: {type: int}}
+      outputs: {mask: {type: "boolean[]"}}
       expression: |
-        ${ return {"times": Array(inputs.n_sleepers).fill(inputs.sleep_time)} }
+        ${
+          var mask = Array(inputs.n_sleepers).fill(false);
+          var spin = Math.floor(Math.random() * inputs.n_sleepers);
+          mask[spin] = true;
+          return {"mask": mask}
+        }
 
   scatter_step:
     doc: |
       This step starts several parallel jobs that each sleep for
-      sleep_time seconds.
+      sleep_time seconds. The job whose k_mask value is true will
+      self-terminate early, thereby activating the kill switch.
     in:
-      time: make_array/times
-    scatter: time
-    out: [ ]
+      time: sleep_time
+      k_mask: roulette/mask
+    scatter: k_mask
+    out: [placeholder]
     run:
       class: CommandLineTool
+      requirements:
+        ToolTimeLimit:
+          timelimit: '${return inputs.k_mask ? 5 : inputs.time + 5}' # 5 is an arbitrary value
       baseCommand: sleep
       inputs:
-        time: { type: int, inputBinding: { position: 1 } }
-      outputs: { }
-
-  kill:
-    doc: |
-      This step waits a few seconds and selects a random scatter_step job to kill.
-      When `--on-error kill` is used, the runner should respond by terminating all
-      remaining jobs and exiting. This means the workflow's overall runtime should be
-      much less than max(sleep_time). The input force_upstream_order ensures that
-      this step runs after make_array, and therefore roughly parallel to scatter_step.
-    in:
-      force_upstream_order: make_array/times
-      sleep_time: sleep_time
-      search_str:
-        valueFrom: $("sleep " + inputs.sleep_time)
-    out: [ pid ]
-    run: ../process_roulette.cwl
+        time: {type: int, inputBinding: {position: 1}}
+        k_mask: {type: boolean}
+      outputs:
+        placeholder: {type: string, outputBinding: {outputEval: $("foo")}}
 
   dangling_step:
     doc: |
       This step should never run. It confirms that additional jobs aren't
       submitted and allowed to run to completion after the kill switch has
-      been set. The input force_downstream_order ensures that this step runs
-      after the kill step.
+      been set. The input force_downstream_order ensures that this step
+      doesn't run before scatter_step completes.
     in:
-      force_downstream_order: kill/pid
+      force_downstream_order: scatter_step/placeholder
       time: sleep_time
-    out: [ ]
+    out: []
     run:
       class: CommandLineTool
       baseCommand: sleep
       inputs:
-        time: { type: int, inputBinding: { position: 1 } }
-      outputs: { }
+        time: {type: int, inputBinding: {position: 1}}
+      outputs: {}
 
 
 outputs:
-  instructed_sleep_times:
-    type: int[]
-    outputSource: make_array/times
+  roulette_mask:
+    type: boolean[]
+    outputSource: roulette/mask
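
For readers less familiar with CWL expressions, the roulette step and the `timelimit` expression above can be mirrored in Python. This is only an illustrative sketch under stated assumptions. The function names `roulette_mask` and `time_limit` and the `margin` parameter are hypothetical, and the real logic runs as JavaScript inside the CWL runner.

```python
import random
from typing import List


def roulette_mask(n_sleepers: int) -> List[bool]:
    """Build a boolean list with exactly one True at a random index."""
    mask = [False] * n_sleepers
    spin = random.randrange(n_sleepers)  # uniform index in [0, n_sleepers)
    mask[spin] = True
    return mask


def time_limit(k_mask: bool, sleep_time: int, margin: int = 5) -> int:
    """Mirror the timelimit expression: a short limit for the selected job,
    and sleep_time plus a margin for every other job."""
    return margin if k_mask else sleep_time + margin


# One job gets a 5-second limit; the rest get sleep_time + 5.
limits = [time_limit(k, 3333) for k in roulette_mask(4)]
```

Under these assumptions, exactly one scattered job receives a limit far shorter than its sleep, so it is the one expected to hit the time limit and trigger the kill switch, while the others would not time out on their own.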
