Skip to content

Set BUILDKITE_CANCEL_SIGNAL to SIGQUIT for build and test#366

Merged
kodiakhq[bot] merged 10 commits intomainfrom
jn/cancel-signal
Jan 9, 2026
Merged

Set BUILDKITE_CANCEL_SIGNAL to SIGQUIT for build and test#366
kodiakhq[bot] merged 10 commits intomainfrom
jn/cancel-signal

Conversation

@vtjnash
Copy link
Member

@vtjnash vtjnash commented Jun 20, 2024

No description provided.

@vtjnash vtjnash requested a review from staticfloat June 20, 2024 18:13
staticfloat
staticfloat previously approved these changes Jun 20, 2024
@vtjnash
Copy link
Member Author

vtjnash commented Jun 20, 2024

Observe here that I intentionally canceled this build, so that it would print SIGQUIT (which it successfully did). Though looks like no upload task then was able to get the core files (the timeout for uploading the core files after cancellation occurs is set to 10 seconds by default, unless changed in the worker configuration)
https://buildkite.com/julialang/julia-buildkite/builds/1556#019036ea-473f-436a-9633-b3828fe142e2

@staticfloat staticfloat force-pushed the jn/cancel-signal branch 2 times, most recently from f0e4bc8 to 94cdb9d Compare June 22, 2024 05:07
@vtjnash

This comment was marked as outdated.

@staticfloat
Copy link
Member

Looks to me like it's still not working.

@vtjnash
Copy link
Member Author

vtjnash commented Jun 24, 2024

Does it need an updated copy of the sandbox binary with JuliaContainerization/Sandbox.jl@269e91f?

@vtjnash
Copy link
Member Author

vtjnash commented Jun 24, 2024

At line 581, ABORTED (32: Broken pipe)!

This seems to indicate the running version is quite old?

@staticfloat
Copy link
Member

This seems to indicate the running version is quite old?

Indeed, there were more layers of sandboxing than I remembered :P

@vtjnash
Copy link
Member Author

vtjnash commented Jun 24, 2024

well that is entertaining on musl:

$ /etc/buildkite-agent/plugins/github-com-JuliaCI-coreupload-buildkite-plugin-sf-error-reporting/hooks/post-command
2024-06-24 18:00:37 EDT	ERROR: Cannot find 'file', needed to parse executable paths out of corefiles!!
2024-06-24 18:00:37 EDT	🚨 Error: The plugin coreupload post-command hook exited with status 1

@staticfloat
Copy link
Member

Aha, my extra debugging information coming in handy! Here's a fix for that minor issue: JuliaCI/rootfs-images#252

@staticfloat
Copy link
Member

I tried canceling one of the test jobs, and it looks to me like the new sandbox executable segfaulted: https://buildkite.com/julialang/julia-buildkite/builds/1577#01904c26-78c2-4120-bdcc-d01a439b5bfa/1373-1378

@vtjnash
Copy link
Member Author

vtjnash commented Jun 25, 2024

It seems more likely to me that the child segfaulted and then the sandbox killed itself with the same exit code. Seems like the SIGQUIT got propagated though, which looked pretty good. Not sure why it then segfaulted and did not coredump, but maybe that is a quirk of rr? Or is the ulimit -c potentially disabled?

@staticfloat
Copy link
Member

Or is the ulimit -c potentially disabled?

ulimit -c unlimited

@vtjnash
Copy link
Member Author

vtjnash commented Jun 25, 2024

@staticfloat
Copy link
Member

The segfault I linked was a test job.

@vtjnash
Copy link
Member Author

vtjnash commented Jun 25, 2024

Right, but it was an rr job too

@staticfloat
Copy link
Member

Right, so for rr jobs, we do not run ulimit -c unlimited, which means that core files are disabled. I believe we do that because rr already uploads traces that are better than core dumps anyway.

@DilumAluthge
Copy link
Member

Would it hurt to just upload core dumps unconditionally (for both rr jobs and non-rr jobs)?

@DilumAluthge
Copy link
Member

Would it hurt to just upload core dumps unconditionally (for both rr jobs and non-rr jobs)?

#368

@staticfloat
Copy link
Member

It's just wasted compute time and storage space. There's really no information in a coredump that an rr trace does not have.

@vtjnash
Copy link
Member Author

vtjnash commented Aug 8, 2024

@staticfloat can you change DROPME: Use sandbox#main to use sandbox@v2 and merge this? CI is green now

@vtjnash
Copy link
Member Author

vtjnash commented Aug 11, 2024

Looks like sandbox needs that branch pushed to correspond to the release tags? And a rebase here

@vtjnash
Copy link
Member Author

vtjnash commented Sep 18, 2024

Needs resigning, and merge?

@vtjnash
Copy link
Member Author

vtjnash commented Oct 9, 2025

Assuming I updated correctly the SHA hash for the new rootfs versions, this may be still ready to re-sign and merge?

@IanButterworth
Copy link
Member

Rebased and resigned. Automerge?

@DilumAluthge
Copy link
Member

It's been a minute, so I don't remember what the most recent state was here. @vtjnash

@vtjnash
Copy link
Member Author

vtjnash commented Jan 8, 2026

It has been ready, as long as each platform base image now has correct support

@DilumAluthge
Copy link
Member

Alright, CI is all green now.

@vtjnash Can I merge this?

@vtjnash
Copy link
Member Author

vtjnash commented Jan 9, 2026

It is good from my perspective

@DilumAluthge DilumAluthge added the automerge Kodiak will auto-merge this PR once all CI is green. (This label was previously called "merge me".) label Jan 9, 2026
@kodiakhq kodiakhq bot merged commit f7c7f5b into main Jan 9, 2026
9 checks passed
@kodiakhq kodiakhq bot deleted the jn/cancel-signal branch January 9, 2026 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge Kodiak will auto-merge this PR once all CI is green. (This label was previously called "merge me".)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants