Commit 68fd13e

testsuite: fix potential races in t4011-match-duration.t
Problem: There are two potential race conditions in the final test in t4011-match-duration.t, which ensures the internal expiration of the scheduler resource is adjusted after an expiration update.

1. A check for the default duration of a submitted job assumes the duration will be strictly less than the instance duration, but if the job starts within the same second as the instance starttime then the duration could exactly match the instance duration, since Fluxion deals in whole seconds.

2. The scheduler is notified of the expiration update asynchronously with respect to the instance update, which is detected in the test via the resource-update event in the resource.eventlog. This could result in the second submitted job receiving an unexpected default duration.

Fix case 1 above by allowing the job duration to match (but not exceed) the instance duration.

Fix case 2 by blocking the test until the scheduler emits a log message for the duration update.
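Case 1 can be illustrated with a few lines of shell arithmetic. This is only a sketch: the timestamps and the `default_duration` helper are hypothetical, and it assumes a job's default duration is the whole number of seconds left before the instance expires.

```shell
#!/bin/sh
# Illustration only: all values and the helper name are hypothetical.
instance_start=1000      # instance starttime, in whole seconds
instance_duration=300    # instance duration, in seconds
expiration=$((instance_start + instance_duration))

# default_duration NOW: whole seconds left before the instance expires,
# assumed here to be what a job submitted without a duration receives.
default_duration() {
    echo $((expiration - $1))
}

same_second=$(default_duration 1000)   # job starts in the instance's first second
two_later=$(default_duration 1002)     # job starts two seconds later

for d in "$same_second" "$two_later"; do
    if [ "$d" -lt "$instance_duration" ]; then strict=pass; else strict=fail; fi
    if [ "$d" -le "$instance_duration" ]; then relaxed=pass; else relaxed=fail; fi
    echo "duration=$d strict(<)=$strict relaxed(<=)=$relaxed"
done
```

Under these assumptions the strict comparison fails only for the job that starts in the same second as the instance, which is why the failure would be intermittent.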
1 parent c5c1fe5 commit 68fd13e

1 file changed: +5 -1 lines changed

t/t4011-match-duration.t

Lines changed: 5 additions & 1 deletion
@@ -12,6 +12,8 @@ test_under_flux 2
 flux setattr log-stderr-level 1
 export FLUX_URI_RESOLVE_LOCAL=t
 
+dmesg_grep="flux python ${SHARNESS_TEST_SRCDIR}/scripts/dmesg-grep.py"
+
 # Ensure fluxion modules are loaded under flux-alloc(1)
 test_expect_success 'set FLUX_RC_EXTRA so Fluxion modules are loaded under flux-alloc' '
 	mkdir rc1.d &&
@@ -102,7 +104,7 @@ test_expect_success FLUX_UPDATE_RUNNING \
 	id1=$(flux proxy $id flux submit sleep 300) &&
 	duration1=$(subinstance_get_job_duration $id $id1) &&
 	test_debug "echo initial duration of subinstance job1 is $duration1" &&
-	echo $duration1 | jq -e ". < 300" &&
+	echo $duration1 | jq -e ". <= 300" &&
 	test_debug "echo updating duration of alloc job +5m" &&
 	flux update $id duration=+5m &&
 	test_debug "echo waiting for resource-update event" &&
@@ -111,6 +113,8 @@ test_expect_success FLUX_UPDATE_RUNNING \
 	exp2=$(subinstance_get_expiration $id) &&
 	test_debug "echo expiration updated from $exp1 to $exp2" &&
 	echo $exp2 | jq -e ". == $exp1 + 300" &&
+	flux proxy $id $dmesg_grep -vt 30 \
+		\"sched.*resource expiration updated\" &&
 	id2=$(flux proxy $id flux submit sleep 300) &&
 	duration2=$(subinstance_get_job_duration $id $id2) &&
 	test_debug "echo duration of subinstance job2 is $duration2" &&
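The fix for case 2 is an instance of a general pattern: block until an asynchronous component logs that it has processed an update, with a timeout so the test cannot hang. A minimal sketch of that pattern in plain shell follows. The `wait_for_log` function, the temporary log file, and the simulated log line are hypothetical stand-ins for illustration. They are not how `dmesg-grep.py` is implemented.

```shell
#!/bin/sh
# Illustration only: wait_for_log and the temporary log file are hypothetical
# stand-ins, not part of flux or dmesg-grep.py.

# wait_for_log FILE PATTERN TIMEOUT
# Poll FILE once per second until a line matches PATTERN (a grep basic
# regular expression). Return 0 on a match, 1 after TIMEOUT seconds.
wait_for_log() {
    file=$1
    pattern=$2
    timeout=$3
    elapsed=0
    while ! grep -q "$pattern" "$file" 2>/dev/null; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

log=$(mktemp)
# Simulate an asynchronous component that logs its update a moment later.
# The message text is made up to match the pattern used in the test.
( sleep 1; echo "sched: resource expiration updated (simulated)" >>"$log" ) &

if wait_for_log "$log" "sched.*resource expiration updated" 5; then
    result=synchronized
else
    result=timeout
fi
echo "$result"
wait
rm -f "$log"
```

Waiting on the consumer's own log message, rather than on the producer's event, closes the window in which the second job could be submitted before the scheduler has applied the new expiration.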
