Commit e585454

feat(publish-image*): Add cosign retries (#83)
We are seeing infrequent cosign failures. Most of them are caused by the external Fulcio service being unavailable for short bursts of time.

To make the publish-image and publish-image-index-manifest actions more resilient against this flakiness, this PR introduces a generic retry script which runs the specified command up to N times, waiting for a configurable timeout between individual tries.

By default, the cosign commands are run up to 3 times with a wait of 30 seconds between tries. These values can be overridden by the following action inputs:

- cosign-retries
- cosign-retry-timeout
1 parent 9a70678 commit e585454

5 files changed: +85 -19 lines changed

.scripts/actions/retry.sh

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+#!/usr/bin/env bash
+
+# Retries any command passed to this script a maximum number of $CMD_RETRIES and
+# wait $CMD_TIMEOUT between each try. If the command failed $CMD_RETRIES, this
+# script will return with exit code 1.
+for TRY in $(seq 1 "$CMD_RETRIES"); do
+  "$@"
+
+  EXIT_CODE=$?
+  # If command ran successfully, exit the loop
+  if [ $EXIT_CODE -eq 0 ]; then
+    break
+  fi
+
+  echo "Command failed $TRY time(s)"
+
+  # Exit if we reached the number if retries and the command didn't run successfully
+  if [ "$TRY" == "$CMD_RETRIES" ]; then
+    echo "Exiting"
+    exit 1
+  fi
+
+  echo "Waiting for $CMD_TIMEOUT to try again"
+  sleep "$CMD_TIMEOUT"
+done
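
To see how this loop behaves without calling cosign, the following is a minimal local sketch. It assumes the same logic wrapped in a function (so it returns instead of exiting the shell); `retry`, `flaky`, and `ATTEMPTS_FILE` are hypothetical names introduced for illustration and are not part of the commit. The stand-in command fails twice and succeeds on the third try.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the loop in retry.sh as a function for local testing.

retry() {
  local try
  for try in $(seq 1 "$CMD_RETRIES"); do
    # Run the wrapped command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi

    echo "Command failed $try time(s)"

    # Give up once the configured number of tries is used up.
    if [ "$try" == "$CMD_RETRIES" ]; then
      echo "Exiting"
      return 1
    fi

    echo "Waiting for $CMD_TIMEOUT to try again"
    sleep "$CMD_TIMEOUT"
  done
}

# Hypothetical stand-in for a flaky cosign call: it counts its invocations
# in a temporary file and only succeeds from the third invocation on.
ATTEMPTS_FILE="$(mktemp)"
echo 0 > "$ATTEMPTS_FILE"

flaky() {
  local n
  n=$(( $(cat "$ATTEMPTS_FILE") + 1 ))
  echo "$n" > "$ATTEMPTS_FILE"
  [ "$n" -ge 3 ]
}

CMD_RETRIES=3
CMD_TIMEOUT=0
retry flaky
RESULT=$?
echo "exit code: $RESULT, attempts: $(cat "$ATTEMPTS_FILE")"
```

With `CMD_RETRIES=3` the command is attempted at most three times in total, and the wait only happens between attempts, never after the last one.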

publish-image-index-manifest/README.md

Lines changed: 10 additions & 6 deletions
@@ -15,12 +15,16 @@ This action creates an image index manifest, publishes it, and signs it. It does
 
 ### Inputs
 
-- `image-registry-uri`(eg: `oci.stackable.tech`)
-- `image-registry-username` (required)
-- `image-registry-password` (required)
-- `image-repository` (eg: `stackable/kafka`)
-- `image-index-manifest-tag` (eg: `3.4.1-stackable0.0.0-dev`)
-- `image-architectures` (defaults to `["amd64", "arm64"]`)
+| Input | Required (Default) | Description |
+| -------------------------- | ------------------------- | ----------------------------------------------------------------------------------- |
+| `image-registry-uri` | Yes | The image registry URI, eg `oci.stackable.tech` |
+| `image-registry-username` | Yes | The username used to access the image registry |
+| `image-registry-password` | Yes | The password used to access the image registry |
+| `image-repository` | Yes | The path to the image, eg `sdp/kafka` |
+| `image-index-manifest-tag` | Yes | Human-readable tag without architecture information, eg `3.4.1-stackable0.0.0-dev` |
+| `image-architectures` | No (`["amd64", "arm64"]`) | The list of architectures the to-bo-published image was built for |
+| `cosign-retries` | No (3) | The number of times cosign operations should be retried |
+| `cosign-retry-timeout` | No (30s) | Duration to wait before a new cosign operation is retried, format: `NUMBER[SUFFIX]` |
 
 ### Outputs
 

publish-image-index-manifest/action.yaml

Lines changed: 12 additions & 1 deletion
@@ -28,6 +28,15 @@ inputs:
         ["amd64", "arm64", "riscv"]
     default: |
       ["amd64", "arm64"]
+  cosign-retries:
+    description: The number of times cosign operations should be retried
+    default: "3"
+  cosign-retry-timeout:
+    description: |
+      Duration to wait before a new cosign operation is retried, format: `NUMBER[SUFFIX]`.
+      SUFFIX may be 's' for seconds (the default), 'm' for minutes, 'h' for hours or 'd' for days.
+      See `sleep --help` for the full details.
+    default: "30s"
 outputs:
   image-index-uri:
     description: The Image Index URI.
@@ -97,6 +106,8 @@ runs:
     - name: Sign Image Index Manifest
       shell: bash
       env:
+        CMD_TIMEOUT: ${{ inputs.cosign-retry-timeout }}
+        CMD_RETRIES: ${{ inputs.cosign-retries }}
         IMAGE_REPOSITORY: ${{ inputs.image-repository }}
         REGISTRY_URI: ${{ inputs.image-registry-uri }}
       run: |
@@ -112,4 +123,4 @@ runs:
         # This generates a signature and publishes it to the registry, next to
         # the image. This step uses the keyless signing flow with Github Actions
         # as the identity provider.
-        cosign sign --yes "$IMAGE_REPO_DIGEST"
+        "$GITHUB_ACTION_PATH/../.scripts/actions/retry.sh" cosign sign --yes "$IMAGE_REPO_DIGEST"
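
The `cosign-retry-timeout` value is passed unchanged to `sleep` via `CMD_TIMEOUT`, so a malformed value would only surface when the first wait happens. As an illustration of the documented `NUMBER[SUFFIX]` shape, here is a minimal sketch of a pre-check. `is_valid_timeout` is a hypothetical helper that is not part of this commit, and its pattern is an assumption that is deliberately stricter than what `sleep` itself may accept.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): accepts a non-negative
# decimal number followed by an optional s, m, h or d suffix.
is_valid_timeout() {
  [[ "$1" =~ ^[0-9]+([.][0-9]+)?[smhd]?$ ]]
}

# Try a few sample values, including two that should be rejected.
for value in 30s 5 2m 1.5h 30x ""; do
  if is_valid_timeout "$value"; then
    echo "'$value' is accepted"
  else
    echo "'$value' is rejected"
  fi
done
```

A check like this could run before the retry script is invoked, so that a typo in the input fails the step immediately rather than after the first failed cosign call.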

publish-image/README.md

Lines changed: 10 additions & 6 deletions
@@ -26,12 +26,16 @@ following work:
 
 ### Inputs
 
-- `image-registry-uri` (eg: `oci.stackable.tech`)
-- `image-registry-username` (required)
-- `image-registry-password` (required)
-- `image-repository` (eg: `stackable/kafka`)
-- `image-manifest-tag` (eg: `3.4.1-stackable0.0.0-dev-amd64`)
-- `source-image-uri` (eg: `localhost/kafka:3.4.1-stackable0.0.0-dev-amd64`)
+| Input | Required (Default) | Description |
+| ------------------------- | ------------------ | ------------------------------------------------------------------------------------- |
+| `image-registry-uri` | Yes | The image registry URI, eg `oci.stackable.tech` |
+| `image-registry-username` | Yes | The username used to access the image registry |
+| `image-registry-password` | Yes | The password used to access the image registry |
+| `image-repository` | Yes | The path to the image, eg `sdp/kafka` |
+| `image-manifest-tag` | Yes | Human-readable tag with architecture information, eg `3.4.1-stackable0.0.0-dev-amd64` |
+| `source-image-uri` | Yes | The source image uri, which gets re-tagged by this action, eg `localhost/kafka:...` |
+| `cosign-retries` | No (3) | The number of times cosign operations should be retried |
+| `cosign-retry-timeout` | No (30s) | Duration to wait before a new cosign operation is retried, format: `NUMBER[SUFFIX]` |
 
 ### Outputs
 

publish-image/action.yaml

Lines changed: 28 additions & 6 deletions
@@ -42,6 +42,15 @@ inputs:
       The source image uri, which gets re-tagged by this action to be pushed to
       the appropriate registry.
     required: true
+  cosign-retries:
+    description: The number of times cosign operations should be retried
+    default: "3"
+  cosign-retry-timeout:
+    description: |
+      Duration to wait before a new cosign operation is retried, format: `NUMBER[SUFFIX]`.
+      SUFFIX may be 's' for seconds (the default), 'm' for minutes, 'h' for hours or 'd' for days.
+      See `sleep --help` for the full details.
+    default: "30s"
 runs:
   using: composite
   steps:
@@ -90,17 +99,22 @@ runs:
 
     - name: Sign the container image (${{ env.IMAGE_REPO_DIGEST }})
       shell: bash
+      env:
+        CMD_TIMEOUT: ${{ inputs.cosign-retry-timeout }}
+        CMD_RETRIES: ${{ inputs.cosign-retries }}
       run: |
         set -euo pipefail
 
         # This generates a signature and publishes it to the registry, next to
         # the image. This step uses the keyless signing flow with Github Actions
         # as the identity provider.
-        cosign sign --yes "${IMAGE_REPO_DIGEST}"
+        "$GITHUB_ACTION_PATH/../.scripts/actions/retry.sh" cosign sign --yes "${IMAGE_REPO_DIGEST}"
 
     - name: Generate SBOM for the container image (${{ env.IMAGE_REPO_DIGEST }})
       shell: bash
       env:
+        CMD_TIMEOUT: ${{ inputs.cosign-retry-timeout }}
+        CMD_RETRIES: ${{ inputs.cosign-retries }}
         IMAGE_MANIFEST_TAG: ${{ inputs.image-manifest-tag }}
         IMAGE_REPOSITORY: ${{ inputs.image-repository }}
         REGISTRY_URI: ${{ inputs.image-registry-uri }}
@@ -141,8 +155,15 @@ runs:
         # Merge SBOM components using https://github.com/stackabletech/mergebom
         curl --fail -L -o mergebom https://repo.stackable.tech/repository/packages/mergebom/stable-$(uname -m)
         curl --fail -L -o mergebom_signature.bundle https://repo.stackable.tech/repository/packages/mergebom/stable-$(arch)_signature.bundle
+
         # Verify signature
-        cosign verify-blob --certificate-identity 'https://github.com/stackabletech/mergebom/.github/workflows/build_binary.yaml@refs/heads/main' --certificate-oidc-issuer https://token.actions.githubusercontent.com --bundle mergebom_signature.bundle mergebom
+        "$GITHUB_ACTION_PATH/../.scripts/actions/retry.sh" cosign verify-blob \
+          --certificate-identity 'https://github.com/stackabletech/mergebom/.github/workflows/build_binary.yaml@refs/heads/main' \
+          --certificate-oidc-issuer https://token.actions.githubusercontent.com \
+          --bundle mergebom_signature.bundle \
+          mergebom
+
+        # Run mergebom
         chmod +x ./mergebom
         ./mergebom sbom_raw.json sbom.json
 
@@ -167,7 +188,8 @@ runs:
         } * .[0]' sbom.json > sbom.merged.json
 
         # Attest the SBOM to the image
-        cosign attest \
-          --yes \
-          --predicate sbom.merged.json \
-          --type cyclonedx "${IMAGE_REPO_DIGEST}"
+        "$GITHUB_ACTION_PATH/../.scripts/actions/retry.sh" cosign attest \
+          --yes \
+          --predicate sbom.merged.json \
+          --type cyclonedx \
+          "${IMAGE_REPO_DIGEST}"
