Harden CI Flow and Checks by aauren · Pull Request #2040 · cloudnativelabs/kube-router

aauren · 2026-03-22T23:41:23Z

CI Security Hardening Pull Request

What type of PR is this?

feature

What this PR does / why we need it:

Hardens the CI pipeline against supply chain attacks and improves release artifact trust. I tried to do this in a way that would balance our small maintainer team and the medium size of this repo with best practices.

This is also necessary / timely because of upcoming EU CRA compliance, which requires SBOMs for software used in the EU. While kube-router isn't a sold product, it's still probably a good practice to begin adopting.

This PR is chained off of #2035 and is meant to be merged after #2035 is merged and a rebase has been performed.

CI refactor:

Splits the monolithic ci.yml into an orchestrating caller (ci.yml) and three reusable
workflow_call workflows (ci-checks.yml, ci-container.yml, ci-release.yml), plus a local
composite action for the repeated checkout + setup-go steps. All jobs remain sequential via
needs: and PR status checks are fully preserved.
ci-unicode-check has been added to check for malicious content that might be sent with PRs but is otherwise hidden from reviewers by special Unicode characters.
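
A minimal sketch of the kind of check ci-unicode-check could perform, assuming it looks for Unicode bidirectional control characters (the "Trojan Source" class). The function name and the character set are illustrative assumptions; the workflow in this PR may use a different tool or a broader set.

```shell
#!/bin/sh
# Hypothetical helper, not the PR's actual ci-unicode-check implementation.
# Bidirectional control characters U+202A..U+202E and U+2066..U+2069, written as
# UTF-8 octal escapes so this script contains no hidden characters itself.
BIDI_PATTERNS="$(printf '\342\200\252\n\342\200\253\n\342\200\254\n\342\200\255\n\342\200\256\n\342\201\246\n\342\201\247\n\342\201\250\n\342\201\251')"

# Prints file:line:text for each hit.
# Returns 0 when the files are clean, 1 when a hit is found, 2 on a grep error.
check_hidden_unicode() {
    rc=0
    # /dev/null forces grep to print the file name even for a single file.
    LC_ALL=C grep -n -F -e "$BIDI_PATTERNS" -- "$@" /dev/null || rc=$?
    case "$rc" in
        0) return 1 ;;   # grep matched: hidden characters present
        1) return 0 ;;   # grep found nothing: clean
        *) return 2 ;;   # grep itself failed (e.g. unreadable file)
    esac
}
```

In CI this could be called with the list of files changed by the PR, failing the job on a non-zero return.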

Supply chain hardening:

All 14 third-party GitHub Actions pinned by commit SHA with version comments. Existing Dependabot
github-actions config will maintain these automatically.
Base images (golang:1.25.7-alpine3.23, alpine:3.23) pinned by digest in both ci.yml and
the Makefile to ensure identical builds locally and in CI.
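
As a rough way to audit the SHA pinning described above, the sketch below lists `uses:` references that are not pinned to a full 40-character commit SHA. The helper name is hypothetical and it is not part of this PR; it skips local actions (`./...`) and would also flag `docker://` references.

```shell
#!/bin/sh
# Hypothetical audit helper; not part of the PR.
# Prints every `uses:` line whose ref is not a full 40-character commit SHA.
# Local actions (./path) are skipped because they are versioned with the repo.
# Exit status follows grep: 0 when unpinned references exist, 1 when none do.
find_unpinned_actions() {
    grep -n -E '^[[:space:]-]*uses:[[:space:]]' -- "$@" /dev/null \
        | grep -v -E 'uses:[[:space:]]+\./' \
        | grep -v -E 'uses:[[:space:]]+[^[:space:]@]+@[0-9a-f]{40}([[:space:]]|$)'
}
```

A possible invocation would be `find_unpinned_actions .github/workflows/*.yml`, treating any output as a review finding.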

Code quality fixes:

CodeQL given an explicit languages: go to prevent silent scan failures if autobuild heuristics
fail

Release artifact trust:

Container images are keyless-signed with cosign (Sigstore) on all tag pushes; signatures logged
in Rekor
SPDX-JSON SBOMs generated for container images and attested to DockerHub via cosign attest —
verifiable directly from the image without visiting GitHub
CycloneDX-JSON SBOM for release binaries attached to each GitHub release
SLSA Build Level 2 provenance for release binaries via actions/attest
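
For consumers, verification of the signed image and its SBOM attestation might look roughly like the sketch below. The identity regexp is an assumption (it presumes signing happens in GitHub Actions workflows of this repository), the function names are hypothetical, and the image reference is whatever tag or digest the caller wants to check.

```shell
#!/bin/sh
# Assumptions, not taken from the PR: the identity regexp below and the
# function names. The OIDC issuer is the standard GitHub Actions token issuer.
IDENTITY_REGEXP='^https://github.com/cloudnativelabs/kube-router/'
OIDC_ISSUER='https://token.actions.githubusercontent.com'

# Verify the keyless (Sigstore) signature on a container image.
verify_image_signature() {
    cosign verify \
        --certificate-identity-regexp "$IDENTITY_REGEXP" \
        --certificate-oidc-issuer "$OIDC_ISSUER" \
        "$1"
}

# Verify the SPDX-JSON SBOM attestation attached to the same image.
verify_image_sbom() {
    cosign verify-attestation \
        --type spdxjson \
        --certificate-identity-regexp "$IDENTITY_REGEXP" \
        --certificate-oidc-issuer "$OIDC_ISSUER" \
        "$1"
}
```

Both functions take a single image reference and pass cosign's exit status through, so they can be chained with `&&` in a deployment script.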

CVE scanning:

make scan added to the Makefile (Docker-first, local fallback, same BUILD_IN_DOCKER pattern
as all other targets) using Grype to scan the locally-built container image
make prep-release now includes scan as its final step
.grype.yaml configures only-fixed: true (suppresses Alpine CVEs with no upstream patch) and
ignores the self-referential kube-router finding
Grype was not added to CI after evaluating the trade-offs: Alpine CVEs with no fix available,
newly published transitive dependency CVEs, and complex conditional logic for bugfix vs
new-release branches would create uncontrollable CI failures for a small maintainer team. CodeQL
and Dependabot continue to provide automated coverage. OpenSSF Scorecard is unaffected — no
Scorecard check evaluates whether container image CVE scanning runs in CI.
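
The Docker-first, local-fallback behaviour of make scan could be sketched in shell as follows. This is an illustration under assumptions, not the Makefile target itself: GRYPE_IMAGE, GRYPE_CONFIG and scan_image are hypothetical names, and BUILD_IN_DOCKER is assumed to hold "true" or "false".

```shell
#!/bin/sh
# Hypothetical names: GRYPE_IMAGE, GRYPE_CONFIG and scan_image are illustrative.
# A real setup would pin GRYPE_IMAGE to a specific version rather than latest.
GRYPE_IMAGE="${GRYPE_IMAGE:-anchore/grype:latest}"
GRYPE_CONFIG="${GRYPE_CONFIG:-.grype.yaml}"

scan_image() {
    image="$1"
    if [ "${BUILD_IN_DOCKER:-true}" = "true" ]; then
        # Docker-first: run Grype from its container image and mount the Docker
        # socket so it can read the locally built image.
        docker run --rm \
            -v /var/run/docker.sock:/var/run/docker.sock \
            -v "$(pwd)/$GRYPE_CONFIG:/grype.yaml:ro" \
            "$GRYPE_IMAGE" -c /grype.yaml "$image"
    else
        # Local fallback: use a grype binary already on PATH.
        grype -c "$GRYPE_CONFIG" "$image"
    fi
}
```

Note that Grype only exits non-zero on findings when a failure threshold is configured, so a release gate would also need one set in the config file or on the command line.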

OpenSSF Scorecard:

New scorecard.yml workflow runs on push to master, weekly, and on branch protection rule changes
Results published to the public Scorecard API and surfaced as code scanning alerts in the Security
tab
README badge added

gotestsum pinned:

GOTESTSUM_VERSION=v1.13.0 added alongside other tool version constants; both @latest
references replaced

Which issue(s) this PR is related to:

N/A

Was AI used during the creation of this PR?

What tool was used: Claude (claude-sonnet-4-6) via OpenCode CLI
To what extent was the tool used? The AI drafted and implemented the entire PR. The human
directed the work, made all architectural decisions, reviewed every phase before committing, and
pushed back on several proposals (e.g. removing Grype from CI, workflow_run vs workflow_call,
env var vs hardcoded versions).
How detailed of a plan? Very detailed — a multi-phase plan with rationale for each tool
selection (including upstream health evaluation of alternatives) was created and reviewed before
implementation began. The plan lives in .plans/CiSecurityHardening/PLAN.md.
Human in the loop? Yes — each phase was paused for human review and an explicit commit before
proceeding. Several AI proposals were rejected or revised based on human judgment.

What, if any, amount of integration testing was done with this change in a Kubernetes environment?

No Kubernetes integration testing — this PR touches only CI workflows, the Makefile, and repository
configuration files. No kube-router runtime behaviour is changed. make scan was validated locally
against an existing built image.

Does this PR introduce a breaking change?

NONE

Anything else the reviewer should know that wasn't already covered?

Permissions model: workflow_call requires permissions to be granted in the caller (ci.yml) —
called workflows cannot self-elevate. The release job explicitly grants contents: write,
id-token: write, and attestations: write. The container job grants id-token: write and
attestations: write. Both are commented in the file.

Digest pins will drift: The base image digests in ci.yml and the Makefile will go stale as
Alpine and Go release patches. These are intentionally pinned for build reproducibility and CVE scan
consistency between local and CI environments. They should be updated as part of normal dependency
maintenance via make update-deps. Dependabot does not currently track env-var image references,
so this is a manual step for now.
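
For reference, refreshing a digest by hand could look like the sketch below. It is not the make update-deps implementation; resolve_digest is a hypothetical helper that pulls a tag and prints the first repository digest Docker records for it.

```shell
#!/bin/sh
# Hypothetical helper; make update-deps in the PR may work differently.
# Pulls the given tag and prints its pinned form, e.g. name@sha256:<digest>.
resolve_digest() {
    image="$1"
    docker pull --quiet "$image" >/dev/null || return 1
    docker inspect --format '{{index .RepoDigests 0}}' "$image"
}
```

The printed reference can then be pasted into the Makefile and ci.yml in place of the stale digest, for example after running `resolve_digest alpine:3.23`.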

First Scorecard run: The score will be zero/unavailable until the workflow runs on master for
the first time. Several checks (e.g. Branch-Protection, Code-Review) depend on repository
settings rather than code and may require separate configuration to improve.

Previously, this was done manually by humans and was therefore not always done consistently. Sometimes dependencies would be missed; other times dependencies would not be updated at all. Additionally, we only used tags which, while good from a release point of view, were not proof against supply chain attacks. This automates the process to hopefully bring in a sense of consistency and allow us to leverage SHA sums to guard against supply chain attacks.

With the prevalence of recent supply chain attacks, this helps avert dependency tampering with re-released versions by pinning to specific SHA sums. This is fully compatible with Dependabot, as it will update both the SHA and the commented version when it does its updates. This also helps prepare for OpenSSF integration by hardening the CI process.

When this is not explicitly set, CodeQL still works, but if anything ever changes with autodetection in the future, it will just silently succeed without producing results. This corrects that by explicitly saying that we want it to look for Go.

mrueg · 2026-03-26T12:06:52Z

Makefile

 		-w /go/src/github.com/cloudnativelabs/kube-router $(DOCKER_BUILD_IMAGE) \
 		sh -c \
-		'go install gotest.tools/gotestsum@latest && CGO_ENABLED=0 gotestsum --format gotestdox -- -timeout 30s github.com/cloudnativelabs/kube-router/v2/cmd/kube-router/ github.com/cloudnativelabs/kube-router/v2/...'
+		'go install gotest.tools/gotestsum@$(GOTESTSUM_VERSION) && CGO_ENABLED=0 gotestsum --format gotestdox -- -timeout 30s github.com/cloudnativelabs/kube-router/v2/cmd/kube-router/ github.com/cloudnativelabs/kube-router/v2/...'


We should probably use go tool / go mod tool for this instead of go install. See also: https://tip.golang.org/doc/modules/managing-dependencies#tools
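
For illustration, the go tool approach might look roughly like the sketch below, assuming Go 1.24 or newer. The function names are hypothetical; the package path and version are the ones from the diff above.

```shell
#!/bin/sh
# Sketch only; function names are hypothetical. Requires Go 1.24+.
GOTESTSUM_VERSION="${GOTESTSUM_VERSION:-v1.13.0}"

# One-time step: record gotestsum as a tool dependency in go.mod.
add_gotestsum_tool() {
    go get -tool "gotest.tools/gotestsum@${GOTESTSUM_VERSION}"
}

# Run tests through the gotestsum version recorded in go.mod.
# Arguments are the packages to test.
run_tests() {
    CGO_ENABLED=0 go tool gotestsum --format gotestdox -- -timeout 30s "$@"
}
```

With this approach the version is tracked in go.mod and go.sum, so a separate GOTESTSUM_VERSION constant in the Makefile would only be needed for the initial `go get -tool`.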

aauren added 7 commits March 22, 2026 16:06

fix(prep-release): ensure toolchain atomicity

15c1a3c

Ensure that the Go version (and others) is the same across all points of reference. In the case of Go, we start by deriving the available Go version from our distro of choice (Alpine) to ensure that it is used the same everywhere.

fix(Makefile): run doctoc and spellcheck from tagged images

26a39f1

feat(prep-release): handle non-versioned docker images also

d6e9861

fix(prep-release): handle yaml anchors in GH actions and add tests

0112be2

feat(ci.yml): add unicode security checks and restructure CI pipeline

89077ef

fix(ci.yml): replace deprecated set-output with GITHUB_OUTPUT

f44abb0

aauren requested review from catherinetcai and mrueg March 22, 2026 23:41

aauren force-pushed the harden_ci_flow_and_checks branch from 087e338 to 74383fa Compare March 22, 2026 23:49

aauren added 7 commits March 22, 2026 18:56

fact(ci): split ci flow across multiple files

8ae1503

Attempts to bound the context a bit when people have to look at these files by splitting them across multiple files and making each one logical part of the CI lifecycle.

feat(Makefile): introduce grype for container scanning

bba0b76

Adds a scan target which is automatically added to the prep-release target that checks for grype vulnerabilities during the release preparation flow.

feat(ci): add sboms and cosign verification for official build artifacts

5cb83cb

feat(Makefile): version gotestsum

432f47f

feat(ci): add OpenSSF scorecard to workflow + README badge

ac1d5fb

aauren force-pushed the harden_ci_flow_and_checks branch from 74383fa to ac1d5fb Compare March 22, 2026 23:58

fix(ci): set provenance mode to min to remove potential errors from the PR body

03fd533

aauren force-pushed the harden_ci_flow_and_checks branch from 3738670 to 03fd533 Compare March 23, 2026 01:21

mrueg reviewed Mar 26, 2026

aauren wants to merge 15 commits into master from harden_ci_flow_and_checks
