Compute job matrix distribution using extbuild matrix #321

Merged: samansmink merged 23 commits into duckdb:main from smvv:try-event-path on Feb 19, 2026

@smvv smvv commented Feb 16, 2026

Context

A few problems with the extension CI tools:

  1. It is hard to test and iterate on this workflow.

    • Idea: Generate a list of commands to run (locally or in CI) instead of hardcoding the commands in the reusable workflow YAML file.
  2. Most CI jobs are expensive to run.

    • Idea: Dynamically compute the matrix based on the CI run environment.

As an initial step, create a small CLI tool, "extbuild", to iterate quickly on these problems. For now, it can compute the job matrix. Later, we can extend the tool to generate the list of commands to run.

Extbuild: CLI for building extensions

Usage: Cross-platform build matrix

extbuild matrix \
  --platform "linux,osx,windows,wasm" \
  --out "$GITHUB_OUTPUT"

Returns the platform matrix for {linux, osx, windows} × {amd64, arm64} and wasm × {eh, mvp, threads}.

The platform and arch values come from the config file config/distribution_matrix.json (by default).
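
As a rough illustration of the cross product described above, the following Python sketch expands a comma-separated platform list into matrix entries. The `VARIANTS` table and the `duckdb_arch` key are assumptions made for this example; the actual schema of config/distribution_matrix.json and the extbuild implementation may differ.

```python
# Hypothetical variant table; the real values come from config/distribution_matrix.json.
VARIANTS = {
    "linux": ["amd64", "arm64"],
    "osx": ["amd64", "arm64"],
    "windows": ["amd64", "arm64"],
    "wasm": ["eh", "mvp", "threads"],
}


def compute_matrix(platforms):
    """Expand a comma-separated platform list into per-platform matrix entries."""
    matrix = {}
    for platform in platforms.split(","):
        platform = platform.strip()
        if platform not in VARIANTS:
            raise ValueError(f"unknown platform: {platform}")
        # One entry per platform/variant pair; the key name is illustrative only.
        matrix[platform] = [
            {"duckdb_arch": f"{platform}_{variant}"}
            for variant in VARIANTS[platform]
        ]
    return matrix
```

Calling `compute_matrix("linux,wasm")` in this sketch yields two linux entries and three wasm entries.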

The output matrix is written:

  • to the console for human readers.
  • (optionally) to --out as GitHub output variables (thus usable for jobs.<job_id>.strategy.matrix).
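
The two output targets listed above can be sketched in Python as follows. GitHub Actions reads `name=value` lines appended to the file named by `$GITHUB_OUTPUT`; the variable name `<platform>_matrix` and the `{"include": [...]}` shape are assumptions for this example, not necessarily what extbuild emits.

```python
import json
import sys
from typing import Optional


def write_matrix(matrix: dict, out_path: Optional[str] = None) -> None:
    """Print the matrix for humans and optionally append GitHub output variables."""
    # Human-readable form on the console.
    json.dump(matrix, sys.stdout, indent=2)
    sys.stdout.write("\n")

    if out_path is None:
        return
    # Append one `name=value` line per platform, as GitHub expects in $GITHUB_OUTPUT.
    with open(out_path, "a", encoding="utf-8") as out:
        for platform, entries in matrix.items():
            # Compact JSON keeps each variable on a single line.
            value = json.dumps({"include": entries}, separators=(",", ":"))
            out.write(f"{platform}_matrix={value}\n")
```

A later job could then consume such a variable with `fromJSON(...)` in its `strategy.matrix`.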

Development

Install dependencies:

brew install go

Build and test extbuild:

make -C scripts/extbuild build test -sj4

The build target creates ./scripts/extbuild/build/extbuild. See ./scripts/extbuild/build/extbuild --help for local usage.

Rollout plan

The job matrix output can differ slightly from the Python script's output. The current rollout plan is:

  1. Enable the tool in CI, compute the matrix, but use the old script's output.
  2. Use the new tool's output in CI.
  3. Remove the old script and the comparison steps.

Warn for job matrix differences

When the job matrix differs, the workflow adds a GitHub warning annotation:

[Screenshot 2026-02-18 at 07 54 38]

(The step error is listed here as well, but it does not stop the run because of continue-on-error.)

It also adds a job summary that is visible on the "run summary" page:

[Screenshot 2026-02-18 at 07 53 41]
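
For readers unfamiliar with the two mechanisms mentioned above, here is a minimal Python sketch. It relies on the `::warning` workflow command and the `$GITHUB_STEP_SUMMARY` file that GitHub Actions provides; the function name, annotation title, and message text are hypothetical, and the PR's workflow does this from shell steps rather than Python.

```python
import os

# Markdown code fence, built from single backticks so this example embeds cleanly.
FENCE = "`" * 3


def report_matrix_diff(platform: str, diff_text: str) -> None:
    """Emit a warning annotation and append the diff to the job summary."""
    # GitHub turns this stdout line into a warning annotation on the run.
    print(
        f"::warning title=Job matrix mismatch::"
        f"{platform} matrix differs between extbuild and the Python script"
    )

    # Markdown appended to $GITHUB_STEP_SUMMARY is rendered on the run summary page.
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if not summary_path:
        return  # Not running inside GitHub Actions.
    with open(summary_path, "a", encoding="utf-8") as summary:
        summary.write(
            f"### {platform} matrix diff\n\n{FENCE}diff\n{diff_text}\n{FENCE}\n"
        )
```

Because the annotation is a warning and not an error, it does not fail the step by itself.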

@smvv smvv changed the title from "Try to dump event path in CI" to "Compute job matrix distribution using extbuild matrix" Feb 18, 2026
@smvv smvv requested review from carlopi and samansmink and removed request for carlopi February 18, 2026 08:35
@smvv smvv marked this pull request as ready for review February 18, 2026 08:35
echo "=== ${platform} matrix diff (extbuild vs python) ==="
if ! diff -u \
<(jq -S 'walk(if type == "object" then del(.opt_in, .run_in_reduced_ci_mode) else . end)' "build/extbuild_ref/${platform}_matrix.json") \
<(jq -S 'walk(if type == "object" then del(.opt_in, .run_in_reduced_ci_mode) else . end)' "build/python_ref/${platform}_matrix.json"); then
Contributor Author

@smvv smvv Feb 18, 2026

note: This strips the JSON fields opt_in and run_in_reduced_ci_mode so that the diff only covers the relevant JSON fields.

(The extbuild matrix command removes those fields from its JSON output, so they would otherwise show up as missing in the diff.)
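
The same normalization can be expressed in Python, which may help when reproducing the comparison locally. This is a sketch, not part of the PR: `strip_keys` mirrors the recursive `del(...)` done by the jq `walk` filter, and `sort_keys=True` mirrors `jq -S`; the function names are hypothetical.

```python
import difflib
import json

IGNORED_KEYS = {"opt_in", "run_in_reduced_ci_mode"}


def strip_keys(value):
    """Recursively drop the ignored keys from every JSON object."""
    if isinstance(value, dict):
        return {k: strip_keys(v) for k, v in value.items() if k not in IGNORED_KEYS}
    if isinstance(value, list):
        return [strip_keys(v) for v in value]
    return value


def matrix_diff(extbuild_matrix, python_matrix) -> str:
    """Return a unified diff of the normalized matrices; empty when they match."""

    def normalize(matrix):
        # Sorted keys and fixed indentation give a stable line-by-line comparison.
        return json.dumps(strip_keys(matrix), indent=2, sort_keys=True).splitlines()

    return "\n".join(
        difflib.unified_diff(
            normalize(extbuild_matrix),
            normalize(python_matrix),
            fromfile="extbuild",
            tofile="python",
            lineterm="",
        )
    )
```

Two matrices that differ only in the ignored fields produce an empty diff here.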

},
}

cmd.Flags().StringVar(&inputPath, "input", "config/distribution_matrix.json", "Input distribution matrix JSON file")
Contributor Author

note: I think it would be useful to rename this flag to matrix, because there could be another input file in the future (not sure, but "input" sounds vague for now). WDYT?

func TestComputePlatformMatrices(t *testing.T) {
t.Parallel()

inputJSON := `{
Contributor Author

@smvv smvv Feb 18, 2026

note: this JSON string looks copied, but there is a reason: it keeps the test stable and independent of the content of the config/distribution_matrix.json file.

 with open(output_json_file_path, "w") as output_json_file:
-    if filtered_data:
-        json.dump(filtered_data, output_json_file, indent=indent)
+    json.dump(filtered_data, output_json_file, indent=indent)
Contributor Author

note: always write {} instead of an empty file.
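
A small sketch of why this matters: an empty dict serializes to `{}`, which is valid JSON, whereas an empty file is not (for example, Python's `json.loads("")` raises an error). The helper name `write_filtered` is hypothetical; the variable names follow the snippet above.

```python
import json


def write_filtered(filtered_data: dict, path: str, indent: int = 2) -> None:
    """Always write valid JSON, even when the filtered matrix is empty."""
    with open(path, "w", encoding="utf-8") as output_json_file:
        # An empty dict is written as "{}", so consumers never see an empty file.
        json.dump(filtered_data, output_json_file, indent=indent)
```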

Collaborator

@carlopi carlopi left a comment

Left a few comments, but this is solid, thanks!

A blocker for merging this is branching a viable target for duckdb (either as v1.5.0 or v1.5-variegata), or at least deciding how to move there, but we can sort that out on the side.

exit 1
fi
- id: parse-matrices
Collaborator

Could we just do without this whole step?

Basically having a:

cp build/python_ref/* build/.

?
The idea is to avoid duplication: stuff is done once instead of twice, which would limit weird copy-paste issues (especially if one were to modify this).

Or the other way around, where stuff is first produced in build/, then copied to build/python_ref/ and compared, then business as usual.

Collaborator

Obviously the question is not "could we", but "would it be better?"

Contributor Author

note: the comparison and the older Python script CI steps are temporary and will be removed after this PR is merged. Therefore, we did not bother changing the steps further.

Comment on lines +1 to +6
GO_TEST_FLAGS ?= -coverprofile=coverage.out -timeout 2s

.PHONY: test cov build clean

test:
go test ${GO_TEST_FLAGS} ./...
Collaborator

Should there be a top level makefile that includes this one?
Or at least, allow for that to be possible?

If so, then it would be handy to prefix the targets, like extbuild-build, extbuild-test, and so on.

For example, we have recently added a way to do:

make vcpkg-setup

in httpfs (https://github.com/duckdb/duckdb-httpfs?tab=readme-ov-file#building--loading-the-extension)

The idea is that httpfs includes the Makefile provided by extension-ci-tools in its own top-level Makefile, but for that it's handy if the commands are scoped.

Collaborator

Or maybe a different Makefile could be offered that calls the commands with the right path. Unsure; we can chat live.

Contributor Author

That would be useful! I think it also raises the high-level question of how we should distribute the extbuild binary to engineers. (Idea: a custom Homebrew tap, precompiled binaries on GitHub, and a GitHub action to install it easily in CI.)

- name: Compute extension build matrix
  continue-on-error: true
  run: |
    make -C extension-ci-tools/scripts/extbuild test build -sj4
Collaborator

Should testing the tool also happen outside of this workflow?

Basically have a separate workflow that just does:

  • setup go
  • build & test extbuild

That could happen on the side, possibly triggered only by changes in the relevant folder.

Collaborator

@carlopi carlopi left a comment

Thanks! I think on my side this looks nice to have, and a good starting point to build on.

I think there is no blocker in merging this from my side.

@samansmink samansmink merged commit 0de7d1d into duckdb:main Feb 19, 2026
37 checks passed