Commit 4133144

jobs/build: increase timeout when WAIT_FOR_RELEASE_JOB set
Our default 240-minute timeout won't be enough when `WAIT_FOR_RELEASE_JOB` is set, since the release job itself needs to wait for all the build-arch jobs to finish before proceeding. Set the timeout to a higher value accordingly.

A downside of this is that it increases the timeout for all the other bits of the job too. Notably, cloud uploads are susceptible to hangs if the infra is flaky (as recently happened with Alibaba). To counter this, add a timeout to cloud uploads. I chose 45 mins after looking at past times in the available build history; the longest the cloud uploads have ever taken is 23 mins, so 45 mins should be a safe value.

We can add more scopes elsewhere in the future if we feel it's warranted.
1 parent 4c02574 commit 4133144

3 files changed: +21 -3 lines changed
jobs/build-arch.Jenkinsfile

Lines changed: 4 additions & 1 deletion
@@ -96,6 +96,9 @@ assert params.VERSION != ""
 def newBuildID = params.VERSION
 def basearch = params.ARCH
 
+// matches between build/build-arch job
+def timeout_mins = 240
+
 // release lock: we want to block the release job until we're done.
 // ideally we'd lock this from the main pipeline and have lock ownership
 // transferred to us when we're triggered. in practice, it's very unlikely the
@@ -104,7 +107,7 @@ lock(resource: "release-${params.VERSION}-${basearch}") {
 // build lock: we don't want multiple concurrent builds for the same stream and
 // arch (though this should work fine in theory)
 lock(resource: "build-${params.STREAM}-${basearch}") {
-    timeout(time: 240, unit: 'MINUTES') {
+    timeout(time: timeout_mins, unit: 'MINUTES') {
     cosaPod(cpu: "${ncpus}",
             memory: "${cosa_memory_request_mb}Mi",
             image: cosa_controller_img,

jobs/build.Jenkinsfile

Lines changed: 13 additions & 1 deletion
@@ -109,8 +109,20 @@ currentBuild.description = "${build_description} Waiting"
 // declare these early so we can use them in `finally` block
 def newBuildID, basearch
 
+// matches between build/build-arch job
+def timeout_mins = 240
+
+if (params.WAIT_FOR_RELEASE_JOB) {
+    // Waiting for the release job effectively means waiting for all the build-
+    // arch jobs we trigger to finish. While we do overlap in execution (by
+    // a lot when EARLY_ARCH_JOBS is set), let's just simplify and add its
+    // timeout value to ours to account for this. Add 30 minutes more for the
+    // release job itself.
+    timeout_mins += timeout_mins + 30
+}
+
 lock(resource: "build-${params.STREAM}") {
-    timeout(time: 240, unit: 'MINUTES') {
+    timeout(time: timeout_mins, unit: 'MINUTES') {
     cosaPod(cpu: "${ncpus}",
             memory: "${cosa_memory_request_mb}Mi",
             image: cosa_img,
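
The timeout arithmetic in the build.Jenkinsfile change can be checked in isolation. The following is a minimal sketch in plain Groovy, not part of the commit: the helper name `effectiveTimeoutMins` and its parameters are hypothetical, and only the values 240 and 30 and the `+=` expression come from the diff.

```groovy
// Hypothetical helper mirroring the arithmetic in jobs/build.Jenkinsfile.
// baseMins is the shared build/build-arch budget; releaseMins is the extra
// allowance for the release job itself.
int effectiveTimeoutMins(boolean waitForReleaseJob, int baseMins = 240, int releaseMins = 30) {
    int timeoutMins = baseMins
    if (waitForReleaseJob) {
        // Same shape as `timeout_mins += timeout_mins + 30` in the diff:
        // our own budget, plus one build-arch budget, plus the release allowance.
        timeoutMins += baseMins + releaseMins
    }
    return timeoutMins
}

println effectiveTimeoutMins(false)  // 240
println effectiveTimeoutMins(true)   // 510
```

With `WAIT_FOR_RELEASE_JOB` set, the job therefore gets 240 + 240 + 30 = 510 minutes (8.5 hours) instead of 240.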

libcloud.groovy

Lines changed: 4 additions & 1 deletion
@@ -304,6 +304,9 @@ def upload_to_clouds(pipecfg, basearch, buildID, stream) {
     }
 
     // Run the resulting set of uploaders in parallel
-    parallel uploaders
+    // It shouldn't take more than 45 minutes.
+    timeout(time: 45, unit: 'MINUTES') {
+        parallel uploaders
+    }
 }
 return this
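
The libcloud.groovy change puts a single deadline around all parallel uploaders, so one hung upload can no longer hold the job until the outer timeout. The sketch below illustrates that idea in plain Groovy using `java.util.concurrent`. It is only an analogy and not how the Jenkins `timeout` step is implemented. The `uploaders` map here is a hypothetical stand-in, and the deadline is scaled down to milliseconds so the script finishes quickly.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical stand-ins for the per-cloud upload closures.
def uploaders = [
    fast: { TimeUnit.MILLISECONDS.sleep(50); 'fast done' },
    hung: { TimeUnit.SECONDS.sleep(60); 'hung done' },  // simulates a stuck upload
]

def pool = Executors.newFixedThreadPool(uploaders.size())
// invokeAll blocks until every task finishes or the shared deadline passes;
// tasks still running at the deadline are cancelled.
def futures = pool.invokeAll(uploaders.values().collect { it as Callable },
                             500, TimeUnit.MILLISECONDS)
pool.shutdownNow()

// Futures come back in the same order as the submitted tasks.
def names = uploaders.keySet() as List
def timedOut = []
futures.eachWithIndex { f, i -> if (f.isCancelled()) timedOut << names[i] }

println "timed out: ${timedOut}"
```

In this sketch only the stuck task is reported as cancelled, while the fast one completes normally. In the actual pipeline the whole `parallel` block sits inside `timeout`, so hitting the 45-minute limit aborts that block.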
