Commit b813111

jobs/build: stop waiting for multi-arch jobs to take lock
The main reason we added that was because in the new "rerun build-arch and release jobs" path, there was a higher likelihood that the release job could in theory take the locks before the build-arch jobs. But with 0664cd6 ("jobs/build: wait when re-running mArch jobs"), this is no longer a concern.

There's still the theoretical possibility the race happens even in the regular path (especially when `EARLY_ARCH_JOBS` is unset), but (1) something must be really slow in the multi-arch jobs for that to happen (in which case, it might end up taking more than our 5 minute timeout anyway) and (2) the worst case is that we release without that arch before it's built, which is salvageable (by rerunning the release job).

So overall, IMO maintaining this code is not worth the complexity. We can always bring it back and adjust the timeout if this is a recurring issue.
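For readers unfamiliar with the removed helper, the pattern it implemented can be sketched outside Jenkins: poll until someone else is seen holding a lock, and give up after a timeout. This is only an illustrative Python stand-in, not the pipeline code; the pipeline version relies on the Jenkins `lock`, `waitUntil`, and `timeout` steps, whereas here a `threading.Lock` plays the role of the `release-${version}-${arch}` resource. The `poll_s` parameter and the boolean return value are assumptions added for the sketch.

```python
import threading
import time


def wait_until_locked_or_continue(lock, timeout_s, poll_s=0.01):
    """Poll until `lock` is held by someone else, or give up after `timeout_s`.

    Returns True if the lock was observed as taken, False on timeout.
    (The return value and `poll_s` are additions for this sketch.)
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if lock.acquire(blocking=False):
            # We got the lock, so nobody else holds it yet. Release it
            # immediately and keep polling.
            lock.release()
            time.sleep(poll_s)
        else:
            # The acquire was refused: another party has taken the lock.
            return True
    # Mirror the removed helper: log and carry on instead of failing.
    print("Timed out waiting for lock to be taken. Continuing...")
    return False
```

The brief window in which the poller itself holds the lock is part of the complexity this commit drops: the caller only ever learns "taken" or "gave up", and the worst case on giving up is the same salvageable outcome described above.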
1 parent 0664cd6 commit b813111

1 file changed: +0, -32 lines
jobs/build.Jenkinsfile

Lines changed: 0 additions & 32 deletions
@@ -1,5 +1,4 @@
 import org.yaml.snakeyaml.Yaml;
-import org.jenkinsci.plugins.workflow.steps.FlowInterruptedException;
 
 node {
     checkout scm
@@ -522,16 +521,6 @@ def run_multiarch_jobs(arches, src_commit, version, cosa_img, wait) {
                 string(name: 'PIPECFG_HOTFIX_REPO', value: params.PIPECFG_HOTFIX_REPO),
                 string(name: 'PIPECFG_HOTFIX_REF', value: params.PIPECFG_HOTFIX_REF)
             ]
-            if (!wait) {
-                // Wait until the locks taken by the `build-arch` jobs are taken
-                // before continuing. This closes a potential race in which once we
-                // trigger the `release` job afterwards, it could end up taking the
-                // locks before the multi-arch jobs.
-                // This really should never take more than 5 minutes. Having a
-                // timeout ensures we don't wait for a long time if we somehow
-                // missed the transition.
-                wait_until_locked_or_continue("release-${version}-${arch}", 5)
-            }
         }]}
     }
 }
@@ -552,24 +541,3 @@ def run_release_job(buildID) {
         ]
     }
 }
-
-// XXX: generalize and put in coreos-ci-lib eventually
-def wait_until_locked_or_continue(resource, timeout_mins) {
-    try {
-        timeout(time: timeout_mins, unit: 'MINUTES') {
-            waitUntil {
-                lock(resource: resource, skipIfLocked: true) {
-                    return false
-                }
-                return true
-            }
-        }
-    } catch (FlowInterruptedException e) {
-        // If the lock was still not taken, then something went wrong. For
-        // example, the job might've failed during the initial `git clone`. The
-        // timeout is to ensure we don't wait forever and here we continue to
-        // try to at least release for the arches that did succeed. We may be
-        // able to salvage the failed arch in the next run.
-        echo "Timed out waiting for lock ${resource} to be taken. Continuing..."
-    }
-}