Compute job matrix distribution using extbuild matrix#321
Compute job matrix distribution using extbuild matrix#321samansmink merged 23 commits intoduckdb:mainfrom
Conversation
| echo "=== ${platform} matrix diff (extbuild vs python) ===" | ||
| if ! diff -u \ | ||
| <(jq -S 'walk(if type == "object" then del(.opt_in, .run_in_reduced_ci_mode) else . end)' "build/extbuild_ref/${platform}_matrix.json") \ | ||
| <(jq -S 'walk(if type == "object" then del(.opt_in, .run_in_reduced_ci_mode) else . end)' "build/python_ref/${platform}_matrix.json"); then |
There was a problem hiding this comment.
note: This strips the json fields opt_in and run_in_reduced_ci_mode to only check the differences for the relevant JSON fields.
(The extbuild matrix command removes those fields from its JSON output lines, and thus would be missing otherwise in the diff.)
| }, | ||
| } | ||
|
|
||
| cmd.Flags().StringVar(&inputPath, "input", "config/distribution_matrix.json", "Input distribution matrix JSON file") |
There was a problem hiding this comment.
note: i think it's useful to rename this flag to matrix because there could be another input file perhaps in the future (not sure, but input sounds vague, for now). WDYT?
| func TestComputePlatformMatrices(t *testing.T) { | ||
| t.Parallel() | ||
|
|
||
| inputJSON := `{ |
There was a problem hiding this comment.
note: this JSON string looks copied but has a reason: it makes the test stable and not dependent on the content of config/distribution_matrix.json file.
| with open(output_json_file_path, "w") as output_json_file: | ||
| if filtered_data: | ||
| json.dump(filtered_data, output_json_file, indent=indent) | ||
| json.dump(filtered_data, output_json_file, indent=indent) |
There was a problem hiding this comment.
note: always print {} instead of empty file.
carlopi
left a comment
There was a problem hiding this comment.
Left a few comments, but this is solid, thanks!
A blocker for merging this it's branching (either as v1.5.0 or v1.5-variegata) a viable target for duckdb. Or anyhow deciding how to move there, but we can sort that out on a side.
| exit 1 | ||
| fi | ||
| - id: parse-matrices |
There was a problem hiding this comment.
Could we just do without this whole step?
Basically having a:
cp build/python_ref/* build/.
?
Idea is that we avoid duplication, stuff it's done once instead of twice, this would allow to limit weird copy paste issues (expecially if one were to modify this.
Or the other way around, where stuff it's first produced in build/, then copied to build/python_ref/ and compared, then business as usual.
There was a problem hiding this comment.
Obv it's not "could", but "would it be better?"
There was a problem hiding this comment.
note: the comparison and older Python script CI steps are temporary and removed later after merging this PR. Therefore, we did not bother with changing the steps further.
| GO_TEST_FLAGS ?= -coverprofile=coverage.out -timeout 2s | ||
|
|
||
| .PHONY: test cov build clean | ||
|
|
||
| test: | ||
| go test ${GO_TEST_FLAGS} ./... |
There was a problem hiding this comment.
Should there be a top level makefile that includes this one?
Or at least, allow for that to be possible?
If so, then it would be handy to prepend like extbuild-build, extbuild-test and so.
For example, we have recently added a way to do:
make vcpkg-setup
in httpfs (https://github.com/duckdb/duckdb-httpfs?tab=readme-ov-file#building--loading-the-extension)
Idea is that httpfs includes in its own top level Makefile extension-ci-tools provided one, but for that it's handy if the commands are scoped.
There was a problem hiding this comment.
Or maybe a different Makefile could be offered, that calls the commands with right path, unsure, we can chat live.
There was a problem hiding this comment.
that would be useful! I think it also raises the high-level question on how we should distribute the extbuild binary to engineers. (Idea: a custom homebrew tap, precompiled binaries on github, and a github action to install it easily in CI.)
| - name: Compute extension build matrix | ||
| continue-on-error: true | ||
| run: | | ||
| make -C extension-ci-tools/scripts/extbuild test build -sj4 |
There was a problem hiding this comment.
Should testing the tool also happen outside of this workflow?
Basically have a separate workflow that just does:
- setup go
- build & test extbuild
That could happen on a side, possibly triggered only to changes in the relevant folder.
carlopi
left a comment
There was a problem hiding this comment.
Thanks! I think on my side this looks nice to have, and a good starting point to build on.
I think there is no blocker in merging this from my side.
Context
A few problems with extension CI tools:
Ideate about how to test and iterate on this workflow.
Expensive to run most CI jobs.
As an initial step, create a small CLI tool "extbuild" to iterating quickly on these problems. For now, it can compute the job matrix. Later, we can extend the tool to generate the command list used.
Extbuild: CLI for building extensions
Usage: Cross-platform build matrix
Returns platform matrix for
linux,osx,windows×amd64,arm64andwasm×eh,mvp,threads.The platform and arch values come from the config file
config/distribution_matrix.json(by default).The output matrix is written:
--outas github output variables (thus usable forjob.strategy.matrix).Development
Install dependencies:
Build and test
extbuild:make -C scripts/extbuild build test -sj4The target
buildcreates./scripts/extbuild/build/extbuild. See./scripts/extbuild/build/extbuild --helpfor local usage.Rollout plan
The job matrix output can be slightly different from the Python script. The current rollout plan is:
Warn for job matrix differences
When the job matrix differs, it adds a github warning annotation:
(the step error is listed here as well, but not stopping the run due to the
continue-on-error.)and a job summary that is visible on the "run summary":
