<!--
Copyright The Shipwright Contributors

SPDX-License-Identifier: Apache-2.0
-->

---
title: multi-arch-image-builds
authors:
  - "@adambkaplan"
reviewers:
  - TBD
approvers:
  - TBD
creation-date: 2025-05-21
last-updated: 2025-08-19
status: implementable
see-also:
  - "ships/0039-build-scheduler-opts.md"
replaces:
  - https://github.com/shipwright-io/community/pull/275
superseded-by: []
---

# SHIP-0043: Multi-arch Image Builds

## Release Signoff Checklist

- [x] Enhancement is `implementable`
- [x] Design details are appropriately documented from clear requirements
- [x] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [docs](/docs/)

## Open Questions [optional]

TBD

## Summary

This proposal extends Shipwright to orchestrate multi-architecture container image builds. It aims
to solve the following challenges:

* Scheduling builds on native Kubernetes nodes for a given OS + architecture.
* Providing a parameter to tools that support multi-arch builds through emulation or
  cross-compilation.

## Motivation

### Background

#### OCI Image Indexes

The Open Container Initiative (OCI) provides the industry standards for container image
specifications and formats. It is the successor to Docker’s “v2” specification for container
images, and is designed to be backwards compatible. The specification includes an
[“image index” standard](https://github.com/opencontainers/image-spec/blob/main/image-index.md#image-index-property-descriptions)
for containers that can be run on multiple CPU and operating system architectures. This is
equivalent to the Docker v2 “manifest list,” and the two terms are used interchangeably. For
consistency in this proposal, “image index” will be used moving forward.

#### Multi-Arch Worker Nodes

Many Kubernetes distributions - starting with v1.30 and perhaps earlier - allow clusters to have
worker nodes with different OS and CPU architectures. Clusters expose the node OS and CPU
architecture through [default node labels](https://kubernetes.io/docs/reference/node/node-labels/).
Shipwright began accommodating these scenarios with
[SHIP-0039](https://github.com/shipwright-io/community/blob/main/ships/0039-build-scheduler-opts.md),
whose features were incrementally released in Builds v0.14 and v0.15. However, these features only
let developers create a single image for a single architecture. Creating an image index for multiple
architectures requires significant orchestration effort outside of Shipwright.
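
For reference, these are the well-known labels this proposal relies on. The node name and label
values in the sketch below are illustrative; only the label keys are defined by Kubernetes.

```yaml
# Illustrative node metadata on a multi-arch cluster.
apiVersion: v1
kind: Node
metadata:
  name: worker-arm64-01        # example name
  labels:
    kubernetes.io/os: linux    # well-known OS label
    kubernetes.io/arch: arm64  # well-known CPU architecture label
```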

#### Multi-Arch Capabilities in Build Toolchains

Many popular container build tools, such as `buildkit`, `buildah`,
[cloud native buildpacks](https://buildpacks.io/docs/for-app-developers/how-to/special-cases/build-for-arm/),
and `ko`, support multi-arch builds through cross-compilation or qemu-style CPU emulation. These
tools often expose a `--platform` command line option to build the image with a different OS +
architecture than the underlying host. The industry appears to have standardized on the
`<GOOS>/<GOARCH>` naming convention for “platform” (ex: `linux/amd64`). These are identical to the
values used for Kubernetes node labels.

Support for generating an OCI image index varies by tool. Some - like ko and buildah - do provide
support for creating image indexes. These typically require the build to run in the same process;
“fan out” support to run these builds in parallel is typically not supported or is more challenging
to set up in a containerized environment (ex: `podman farm` command).

### Goals

* Provide a mechanism for developers to request a multi-arch image build, or build for a specific
  OS + CPU architecture.

### Non-Goals

* Generalized matrix builds for Shipwright. This is a capability provided by Tekton.
* Adding retries for failed builds. This is out of scope to simplify the design. Such a feature can
  be considered in a follow-up enhancement.
* Scheduling builds on nodes with specialized hardware (ex: GPUs). This is already supported in
  Shipwright through the scheduler options in v0.15.
* Management of container resources (CPU, memory) at the Build/BuildRun level. At present, these
  can only be defined at the ClusterBuildStrategy/BuildStrategy level. See
  [build#1894](https://github.com/shipwright-io/build/issues/1894).

## Proposal

### User Stories

- As a developer, I want to build containers for x86 and ARM so I can share my app with my team
  using different CPU architectures (Apple Silicon vs. Windows x86).
- As a cluster admin, I want multi-arch builds to be scheduled on native nodes if my Kubernetes
  cluster has multiple CPU architecture worker nodes.
- As a platform engineer, I want to provide a standard way for my teams to run multi-arch container
  builds.

### Implementation Notes

#### “Image Platform” Concept

An image platform is the combination of operating system (“os”), CPU architecture (“arch”), and
other container image “platform” attributes as defined in the OCI [Image Index specification](https://github.com/opencontainers/image-spec/blob/main/image-index.md#image-index-property-descriptions).
Developers can specify the desired platform(s) for a container image build as a JSON/YAML object
with the following attributes:

- `os`: operating system. Required.
- `arch`: CPU architecture. Required.

The JSON/YAML representation is intended to future-proof the API for additional “features” defined
in the OCI image index spec, or other items that Shipwright can support at a later date (ex: CPU
arch variant, os.version).

A shorter single-string format for “platform” is not allowed within the Kubernetes YAML. However,
it can be supported when invoked from the command line (see below).
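
For illustration, the single-string form `linux/arm64` used on the command line would correspond to
the following object form in YAML (a sketch of the intended mapping, not a normative schema):

```yaml
# CLI shorthand "linux/arm64" expressed as an image platform object.
platforms:
  - os: linux
    arch: arm64
```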

#### Multi-arch in `Build` and `BuildRun` objects

The `Build` and `BuildRun` APIs will add a new `multiArch` JSON/YAML object to `spec.output`. This
object will contain the following fields:

- `platforms`: list of platforms to build, using the above "image platform" structure. Required.

Below is an example multi-arch Linux image build for x86, ARM, Power, and Z:

```yaml
apiVersion: shipwright.io/v1beta1
kind: Build
spec:
  ...
  output:
    image: <url>
    multiArch:
      platforms:
      - arch: amd64
        os: linux
      - arch: s390x
        os: linux
      - arch: arm64
        os: linux
      - arch: ppc64le
        os: linux
```

#### `BuildRun` Controller Reconciliation

##### Validations

The following validations should be run if `spec.output.multiArch.platforms` is not empty.

For each image platform referenced in the `platforms` array, the `BuildRun` controller should
verify that at least one node with the respective `kubernetes.io/os` and `kubernetes.io/arch` label
values is present. If no such node exists, the controller should set the `BuildRun`'s status to
failed with an appropriate message and reason code.

The controller should also check that `spec.nodeSelector` does not have any values set for
`kubernetes.io/os` or `kubernetes.io/arch` in the label matcher. If any value is present, the
controller should likewise set the `BuildRun`'s status to failed with an appropriate message and
error code.
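
For example, a `BuildRun` like the following sketch would fail the second validation, because its
`spec.nodeSelector` pins the architecture label that the multi-arch fan-out needs to control. The
object is only illustrative; the names and image URL are placeholders.

```yaml
apiVersion: shipwright.io/v1beta1
kind: BuildRun
metadata:
  name: conflicting-selector
spec:
  build:
    name: sample-go
  # Conflicts with multiArch: the controller reserves kubernetes.io/os and
  # kubernetes.io/arch for scheduling the per-platform TaskRuns.
  nodeSelector:
    kubernetes.io/arch: amd64
  output:
    image: registry.example.com/team/sample-go:v1
    multiArch:
      platforms:
      - os: linux
        arch: amd64
      - os: linux
        arch: arm64
```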

##### Tekton `PipelineRun` Generation

If all checks above pass, the `BuildRun` controller will generate a Tekton `PipelineRun` to
execute the build. This will require significant refactoring of the existing codebase, which
currently generates a single `TaskRun` that is effectively “single-threaded.”

The containers in the generated `PipelineRun` will be executed in three phases:

**Phase 1: Obtain Source**

The first phase will gather the source code. The mechanism for the generated `TaskRun` will vary
depending on the values in `spec.source` for the Build/BuildRun.

Code from git will invoke the current Shipwright git clone process as a TaskRun with the following
containers:

- The main “git clone” container that exists today.
- A second “image push” container, leveraging the existing Shipwright container that supports
  “managed push”. This container will package the source code into an OCI artifact, which is then
  pushed to the same registry as the output image. A tag suffix pattern (`-src`) will be used to
  ensure the source code artifact is persisted on most image registries, as illustrated at the end
  of this phase. This may require significant enhancements to the current “image push” container.

Code from “local source” will likewise invoke a TaskRun as above to receive source code from a
remote machine. It will push the source code to an OCI artifact, as above.

Code from an OCI artifact will not invoke any TaskRun during this phase. The push of source code to
the image registry has already been completed.
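
To make the tag handling concrete, the sketch below shows how the source artifact reference could
be derived from the output image. The exact naming is an assumption based on the `-src` suffix
pattern described above, not a finalized convention.

```yaml
# Sketch only: derived source-artifact reference (assumed convention).
output:
  image: registry.example.com/team/sample-go:v1
# Phase 1 would package the source and push it as an OCI artifact to:
#   registry.example.com/team/sample-go:v1-src
# Phase 2 TaskRuns then pull this artifact instead of fetching the source again.
```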

**Phase 2: Fan Out Builds**

The generated `PipelineRun` will then define a set of Tekton `TaskRuns` that can be run in
parallel, one for each platform in `spec.output.multiArch.platforms`. Each `TaskRun` will set the
appropriate `nodeSelector` to schedule the build on a node with the matching `kubernetes.io/os` and
`kubernetes.io/arch` labels.

The `TaskRun` containers will do the following:

- Pull the source code from the referenced OCI artifact. This will utilize existing Shipwright
  logic for pulling source code from OCI artifacts.
- Execute the build per the referenced build strategy.
- Push the output container image to the image registry, with the `-<os>-<arch>` tag suffix.
- Publish the output container image digest as a `TaskRun` result value.

All other fields used to control the build pod definition - such as resources and volumes - will be
inherited from the parent Build/BuildRun object as they are today.
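
A heavily trimmed sketch of how one fan-out task could be pinned to a matching node is shown below.
The task name, placeholder step, and use of Tekton's `taskRunSpecs`/`podTemplate` mechanism are
assumptions for illustration, not a finalized design.

```yaml
apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  name: buildrun-sample-go-fanout   # illustrative name
spec:
  pipelineSpec:
    tasks:
      - name: build-linux-arm64     # hypothetical fan-out task for one platform
        taskSpec:
          steps:
            - name: build-and-push  # stands in for the build strategy steps
              image: registry.example.com/strategy-image:latest  # placeholder
              script: |
                echo "build for linux/arm64, push with the -linux-arm64 tag suffix"
  taskRunSpecs:
    - pipelineTaskName: build-linux-arm64
      podTemplate:
        # Schedule this platform's build on a matching native node.
        nodeSelector:
          kubernetes.io/os: linux
          kubernetes.io/arch: arm64
```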

**Phase 3: Assemble Index Image**

The last phase of the generated `PipelineRun` will create a `TaskRun` that assembles the OCI image
index, based on the results of the prior build `TaskRuns`.

If any platform failed to build, the `PipelineRun` should fail and subsequently mark the `BuildRun`
as failed.
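
As a sketch of how the per-platform digests could flow into this final task, Tekton result
references could be passed as parameters. The fragment below would sit in the generated pipeline's
`tasks` list; the task, result, and image names are hypothetical.

```yaml
# Hypothetical wiring of per-platform digests into the index-assembly task.
- name: assemble-image-index
  runAfter:
    - build-linux-amd64
    - build-linux-arm64
  params:
    - name: amd64-digest
      value: $(tasks.build-linux-amd64.results.IMAGE_DIGEST)
    - name: arm64-digest
      value: $(tasks.build-linux-arm64.results.IMAGE_DIGEST)
  taskSpec:
    params:
      - name: amd64-digest
      - name: arm64-digest
    steps:
      - name: create-index   # stands in for the actual index-assembly tooling
        image: registry.example.com/index-assembler:latest  # placeholder
        script: |
          echo "assemble OCI image index from $(params.amd64-digest) and $(params.arm64-digest)"
```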

#### CLI Enhancements

The CLI will add the `--platform` flag to the Build- and BuildRun-oriented commands. These shall
set the respective values in `spec.output.multiArch`. The `--platform` option can be set multiple
times, and will accept platforms in their single-line `<os>/<arch>` format.

Example experience:

```sh
shp build create sample-go --strategy=source-to-image \
  --output quay.io/adambkaplan/sample-go:v1 \
  --platform=linux/amd64 \
  --platform=linux/arm64

shp build run sample-go --platform linux/amd64 \
  --platform linux/arm64 \
  --platform linux/s390x
```

### Test Plan

Testing will primarily be at the unit level, where the configuration of the generated Tekton
`PipelineRun` and its `TaskRun`s can be verified. Current integration tests use KinD clusters,
which can't feasibly mimic a Kubernetes cluster with multiple architecture worker nodes.

End-to-end testing is not feasible unless the project obtains access to a real Kubernetes cluster
with multiple CPU architecture worker nodes.

### Release Criteria

#### Removing a deprecated feature [if necessary]

N/A

#### Upgrade Strategy [if necessary]

The new `multiArch` field in `Build` and `BuildRun` objects will be optional (fields within it will
be required). Current builds should work as expected.

### Risks and Mitigations

TBD

> What are the risks of this proposal and how do we mitigate? Think broadly. For example, consider
> both security and how this will impact the larger Shipwright ecosystem.

> How will security be reviewed and by whom? How will UX be reviewed and by whom?

## Drawbacks

### Verbose API for “Platform”

Developers are used to a single string representation of “platform” - ex: `linux/amd64`. This is
provided through the command line, but not in the YAML.

Using a verbose API in the YAML allows us to future-proof builds with more complex manifest list
definitions. The OCI image index spec already allows additional fields on image index entries;
these are excluded from this initial API for the sake of simplicity:

- `variant` - some (but not all?) build tools support this. Ex: podman, buildah.
- `os.version` - use is not really observed in the field today.
- `features` - this is a catch-all for future extensions to the OCI image spec. This might be
  relevant for AI workloads if a container image requires specific hardware to execute.

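If any of these were added later, the object form could absorb them without breaking existing
builds. The sketch below is purely illustrative of that flexibility and is not part of this
proposal.

```yaml
# Illustrative only: possible future extensions, not proposed here.
platforms:
  - os: linux
    arch: arm64
    variant: v8              # hypothetical future field
  - os: windows
    arch: amd64
    os.version: 10.0.20348   # hypothetical future field
```
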
### Limited Mechanism for Multi-arch Builds

This proposal only supports multi-arch builds on nodes with the respective OS and CPU architecture.
There are other potential mechanisms for building multi-arch container images:

- Use cross-compilation if the programming language/SDK supports it.
- Use emulation if the container build tool supports it.
- Execute builds on virtual machines with appropriate OS + CPU architecture.

## Alternatives

### String Shorthand for “Platform” in YAML

Many developers who do multi-arch builds with Podman or Buildkit are familiar with a single string
representation of “platform”. While convenient, this adds additional complexity to the storage and
serialization of data in Kubernetes. The shorthand also locks the API to a convention that may not
be universally understood by all build tools.

Supporting the shorthand in the CLI provides this capability in spirit, and closer to where
developers directly interact with builds.

### Use Kata Containers for Multi-Arch Builds

Kata Containers and Confidential Containers support a deployment known as
["peer pods"](https://confidentialcontainers.org/docs/architecture/design-overview/#clouds-and-nesting),
where a remote virtual machine can be provisioned and managed in Kubernetes just like a pod. This
can _hypothetically_ be used to securely run builds on machines with different CPU architectures.
On Kubernetes, Kata peer pods are scheduled by specifying a [runtime class](https://kubernetes.io/docs/concepts/containers/runtime-class/)
for the pod.

The Konflux CI project experimented with this approach for multi-arch container builds, with
[mixed results](https://groups.google.com/g/konflux/c/A0m0JWjYwnc/m/mUOSFkAsAwAJ?utm_medium=email&utm_source=footer).
The proof of concept showed that Tekton can schedule these build pods just fine; however, there are
fundamental challenges scaling build workloads using dynamic virtual machines. Provisioning VMs and
the required Kata Containers components for a given CPU architecture is also challenging.

Incorporating runtime class features into any multi-arch build capability would add significant
complexity and may require APIs for cluster administrators (see below).

### Control Multi-Arch Method with New APIs

A previous version of this proposal introduced cluster-level APIs that determined how a multi-arch
build could execute. This would allow an administrator to specify that some OS/CPU architecture
builds could run natively on respective Kubernetes nodes, whereas others could use alternative
mechanisms for producing the image for a given OS + CPU arch. These were known as the
`ImagePlatformClass` APIs.

This feature was primarily motivated by the Kata + Confidential containers approach above. Due to
the technical challenges identified above, it does not make sense for Shipwright to provide new
CRDs for this purpose. In addition, the many-to-many relationship between the `ImagePlatform` and
`ImagePlatformClass` CRDs may have been challenging to implement. This design can be revisited at a
later date, should the Kata + Confidential containers approach prove feasible and scalable.

### Tekton Matrix Builds

Tekton has support for matrixed tasks, and work is underway to improve this feature to support
node selectors and other pod template features. In theory, the matrix could be used to run
Shipwright builds per architecture, especially if we release the Triggers project, which provides a
"Shipwright build in Tekton pipeline" capability.

Our mission is to provide a simplified, opinionated experience for building container images.
Multi-arch is a clear use case specific to container image builds. Tekton deliberately provides
general-purpose solutions that emphasize flexibility. Using the matrix tasks feature would expose
developers to a significant amount of complexity.

## Infrastructure Needed [optional]

None for unit tests.

End-to-end testing may require a Kubernetes cluster with multiple CPU architecture worker nodes.

## Implementation History

- 2025-05-21: Initial Draft
- 2025-08-19: Revised draft with narrower scope