Conversation
This requires upgrading golangci-lint to v2, which involves changing its config.
Each DaemonSet will only schedule pods on nodes with the corresponding OS version, preventing compatibility issues between different Ubuntu distributions.
This test needed to be updated to handle the new daemonset deployment style.
…ley/fix-fault-test Fix fault-test.sh
Bump version to v0.4.0
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dabradley The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Release-focused update preparing v0.4.0, including dependency/tooling bumps, new OS-specific node DaemonSets, and improved test/security hardening.
Changes:
- Introduces Ubuntu flavor-specific node DaemonSets (jammy/noble) and updates install/uninstall + long-haul scripts accordingly.
- Updates Go/K8s/Azure dependencies and tightens linting/CI tooling configuration.
- Improves command execution safety and reliability in tests and runtime scripts (context-aware exec, taint-removal refactor, entrypoint validation).
Reviewed changes
Copilot reviewed 35 out of 2803 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| test/utils/credentials/credentials.go | Adds gosec suppression + safer file close handling in tests. |
| test/sanity/sanity_test.go | Uses context-aware command execution for sanity tests. |
| test/long-haul/utils.sh | Resets driver across multiple node DaemonSet flavors; updates pod selection logic. |
| test/long-haul/update-test.sh | Selects node pod via labels for flavor-agnostic behavior. |
| test/long-haul/fault-test.sh | Targets the correct node DaemonSet by deriving it from the running pod. |
| test/integration_dalec/setup_integration_test_dalec.sh | Updates example image version for v0.4.0. |
| test/integration/integration_test.go | Adds minimal input validation + uses CommandContext. |
| pkg/csi-common/server.go | Switches to net.ListenConfig.Listen with context. |
| pkg/csi-common/driver_test.go | Bumps test vendor version to v0.4.0. |
| pkg/azurelustreplugin/main.go | Checks fmt.Println error when printing version info. |
| pkg/azurelustreplugin/entrypoint.sh | Tightens env var requirements; improves OS/arch checks and logging. |
| pkg/azurelustreplugin/Dockerfile | Parameterizes base image; reduces installed packages. |
| pkg/azurelustre/nodeserver.go | Uses volumehelper.MakeDir instead of os.MkdirAll. |
| pkg/azurelustre/dynamic_provisioning.go | Refactors provisioning-state handling; adds nil guard for subnet size. |
| pkg/azurelustre/controllerserver_test.go | Fixes testify assert usage. |
| pkg/azurelustre/azurelustre_test.go | Refactors fake driver usage; adds synctest-based taint-removal tests. |
| pkg/azurelustre/azurelustre.go | Updates cloud-provider-azure config type; refactors taint-removal hook. |
| hack/verify-golint.sh | Updates golangci-lint installer/version gate logic. |
| hack/verify-gofmt.sh | Avoids failing early on gofmt invocation errors (currently too permissive). |
| go.mod | Updates Go version and major dependency versions (K8s, Azure SDK, etc.). |
| docs/driver-parameters.md | Adjusts documented required permissions. |
| docs/csi-dev.md | Updates build/push docs to ACR-oriented flow. |
| docs/csi-debug.md | Expands debugging guidance, especially for OS-specific DaemonSets. |
| deploy/uninstall-driver.sh | Removes legacy node DaemonSet and uninstalls jammy/noble DaemonSets. |
| deploy/install-driver.sh | Installs jammy/noble DaemonSets and removes legacy node DaemonSet. |
| deploy/csi-azurelustre-node.yaml | Removes legacy single node DaemonSet manifest. |
| deploy/csi-azurelustre-node-noble.yaml | Adds Ubuntu 24.04 (noble) node DaemonSet manifest. |
| deploy/csi-azurelustre-node-jammy.yaml | Adds Ubuntu 22.04/20.04(CVM) (jammy) node DaemonSet manifest. |
| deploy/csi-azurelustre-controller.yaml | Updates sidecar versions and controller image tag. |
| deploy/README-distribution-specific.md | Documents the new distribution-specific DaemonSet approach. |
| README.md | Adds v0.4.0 to the version table. |
| Makefile | Bumps version; builds/pushes flavor-tagged images; changes CGO behavior. |
| .golangci.yaml | Migrates to golangci-lint v2 config format; expands enabled linters/formatters. |
| .github/workflows/trivy.yaml | Scans both jammy/noble test images. |
| .github/workflows/static.yaml | Updates golangci-lint GitHub Action and linter version pin. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Update documentation to correctly acknowledge new image flavors
e584d79 to
117b328
Compare
There was a problem hiding this comment.
Pull request overview
This PR prepares the v0.4.0 release by updating versioning/dependencies, introducing distro-specific node DaemonSets/images (jammy/noble), and tightening CI/linting and test tooling.
Changes:
- Release bump to v0.4.0 (docs/manifests/tests), plus new jammy/noble node DaemonSets and image tags.
- Tooling updates: Go version bump, golangci-lint v2 config, workflow/tool scripts and dependency upgrades.
- Reliability/security tweaks: context-aware exec in tests, improved file handling, extra nil checks, and operational script adjustments.
Reviewed changes
Copilot reviewed 35 out of 2803 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| test/utils/credentials/credentials.go | Suppress gosec warning for test-only file read; safer file close handling |
| test/sanity/sanity_test.go | Use exec with context for sanity script |
| test/long-haul/utils.sh | Handle multiple node DaemonSet flavors; label-based pod selection |
| test/long-haul/update-test.sh | Select node pods via label selector |
| test/long-haul/fault-test.sh | Patch the correct DaemonSet by discovering owner from a node pod |
| test/integration_dalec/setup_integration_test_dalec.sh | Update example image tag to v0.4.0 flavor |
| test/integration/integration_test.go | Add basic input validation and exec with context |
| pkg/csi-common/server.go | Use net.ListenConfig with context |
| pkg/csi-common/driver_test.go | Bump vendor version used by tests |
| pkg/azurelustreplugin/main.go | Check fmt.Println error when printing version info |
| pkg/azurelustreplugin/entrypoint.sh | Add OS/arch checks and require env vars for Lustre client installs |
| pkg/azurelustreplugin/Dockerfile | Parameterize base image; shrink installed packages |
| pkg/azurelustre/nodeserver.go | Switch directory creation helper for subdir creation |
| pkg/azurelustre/dynamic_provisioning.go | Adjust provisioning-state handling; add nil guard for subnet size |
| pkg/azurelustre/controllerserver_test.go | Fix testify assert.Empty usage |
| pkg/azurelustre/azurelustre_test.go | Update version/driver naming; add synctest-based taint removal test |
| pkg/azurelustre/azurelustre.go | Use cloud-provider-azure config type; refactor taint removal trigger |
| hack/verify-golint.sh | Require golangci-lint v2; update installer version |
| hack/verify-gofmt.sh | Avoid failing script when gofmt pipeline errors |
| go.mod | Bump Go version and key dependencies (Azure SDK, k8s, cloud-provider-azure, etc.) |
| docs/driver-parameters.md | Adjust listed permissions |
| docs/csi-dev.md | Update push instructions to ACR flow/targets |
| docs/csi-debug.md | Significant debugging doc expansion for distro-specific DaemonSets |
| deploy/uninstall-driver.sh | Delete all node DaemonSets by label; remove old single-DS uninstall |
| deploy/install-driver.sh | Install jammy+noble node DaemonSets; remove old single-DS install/rollout |
| deploy/csi-azurelustre-node.yaml | Remove legacy single node DaemonSet manifest |
| deploy/csi-azurelustre-node-noble.yaml | Add Ubuntu 24.04 node DaemonSet |
| deploy/csi-azurelustre-node-jammy.yaml | Add Ubuntu 22.04/20.04(CVM) node DaemonSet |
| deploy/csi-azurelustre-controller.yaml | Update sidecar versions; move controller image to v0.4.0-jammy |
| deploy/README-distribution-specific.md | New docs for distro-specific node deployments |
| README.md | Document new image flavoring and v0.4.0 images |
| Makefile | Build/push jammy+noble images; bump IMAGE_VERSION; enable CGO |
| .golangci.yaml | Migrate to golangci-lint v2 config format; expand enabled linters/formatters |
| .github/workflows/trivy.yaml | Scan both jammy and noble images |
| .github/workflows/static.yaml | Update lint action/version and configure golangci-lint v2 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if [[ "${installClientPackages}" == "yes" ]]; then | ||
| if [[ -z ${LUSTRE_VERSION} ]]; then | ||
| echo "LUSTRE_VERSION environment variable is not set" | ||
| exit 1 | ||
| fi | ||
| requiredLustreVersion=${LUSTRE_VERSION} | ||
| echo "requiredLustreVersion: ${requiredLustreVersion}" | ||
|
|
||
| requiredClientSha=${CLIENT_SHA_SUFFIX:-"33-g79ddf99"} | ||
| echo "requiredClientSha: ${requiredClientSha}" | ||
| if [[ -z ${CLIENT_SHA_SUFFIX} ]]; then | ||
| echo "CLIENT_SHA_SUFFIX environment variable is not set" | ||
| exit 1 | ||
| fi | ||
| requiredClientSha="${CLIENT_SHA_SUFFIX}" |
There was a problem hiding this comment.
The script now hard-requires LUSTRE_VERSION and CLIENT_SHA_SUFFIX when AZURELUSTRE_CSI_INSTALL_LUSTRE_CLIENT is yes (default). This will break controller pods (and any deployment) that don’t set these vars, and [[ -z ${LUSTRE_VERSION} ]] can also fail under set -u when unset. Consider either (a) restoring safe defaults (as before) for these env vars, or (b) changing the default installClientPackages to no and explicitly setting it to yes (plus the required vars) only in node DaemonSets; also use "${LUSTRE_VERSION:-}" / "${CLIENT_SHA_SUFFIX:-}" in the -z checks.
| image: mcr.microsoft.com/oss/v2/kubernetes-csi/azurelustre-csi:v0.4.0-jammy | ||
| imagePullPolicy: Always | ||
| args: | ||
| - "-v=5" |
There was a problem hiding this comment.
With the entrypoint now defaulting AZURELUSTRE_CSI_INSTALL_LUSTRE_CLIENT to yes and requiring LUSTRE_VERSION/CLIENT_SHA_SUFFIX, the controller Deployment will likely crash-loop unless it explicitly disables client installation (controller doesn’t need Lustre kernel modules). Recommended: set AZURELUSTRE_CSI_INSTALL_LUSTRE_CLIENT="no" in the controller container env (or provide the required Lustre env vars), and keep Lustre install enabled only for node DaemonSets.
| echo Executing: "$1 ${2-} ${3-} ${4-} ${5-} ${6-} ${7-} ${8-} ${9-}" | ||
| $1 ${2-} ${3-} ${4-} ${5-} ${6-} ${7-} ${8-} ${9-} |
There was a problem hiding this comment.
Executing via $1 ${2-} ... ${9-} loses arguments beyond $9 and reintroduces word-splitting/globbing bugs for arguments containing whitespace/special chars. Prefer echo "Executing: $*" (or $@ for exact args) and then exec "$@" to preserve all arguments reliably and replace the shell process.
| echo Executing: "$1 ${2-} ${3-} ${4-} ${5-} ${6-} ${7-} ${8-} ${9-}" | |
| $1 ${2-} ${3-} ${4-} ${5-} ${6-} ${7-} ${8-} ${9-} | |
| echo "Executing: $*" | |
| exec "$@" |
| echo "Verifying gofmt" | ||
|
|
||
| diff=$(find . -name "*.go" | grep -v "\/vendor\/" | xargs gofmt -s -d 2>&1) | ||
| diff=$(find . -name "*.go" | grep -v "\/vendor\/" | xargs gofmt -s -d 2>&1 || true) |
There was a problem hiding this comment.
|| true will mask real failures (e.g., gofmt not found, filesystem errors, xargs errors), which can cause the verify step to pass when it should fail. Prefer preserving the original exit code and only treating “diff output exists” as failure (e.g., capture output without swallowing non-zero exit statuses, or explicitly handle the “no files” xargs case with -r where available).
| diff=$(find . -name "*.go" | grep -v "\/vendor\/" | xargs gofmt -s -d 2>&1 || true) | |
| diff=$(find . -name "*.go" ! -path "./vendor/*" -print0 | xargs -0 gofmt -s -d 2>&1) |
…-contacts Update security contacts by replacing Justin-Jin with dabradley
…-for-dev-branch Pin trivy action to SHA
…-for-dev-branch Pin trivy action to SHA
These workflows were either never supported, such as the non-linux builds, or are no longer supported, such as the CI test runs, which we now do internally before releases.
Pin all third-party GitHub Actions to full commit SHAs instead of mutable version tags, with version comments for readability. This prevents supply chain attacks where a tag could be moved to point to malicious code. Workflows updated: - codeql-analysis.yml - codespell.yml - linux.yaml - sanity_test_local.yaml - static.yaml - trivy.yaml
Add top-level 'permissions: contents: read' to all remaining workflows to follow the principle of least privilege. The codeql-analysis.yml job-level permissions block is simplified to only declare security-events: write (required for SARIF upload), removing the redundant actions: read permission. Workflows updated: - codeql-analysis.yml - codespell.yml - linux.yaml - sanity_test_local.yaml - static.yaml - trivy.yaml
The */24 step value in the hour field (0-23) is invalid/misleading since 24 exceeds the valid range. Replaced with a daily run at 03:37 UTC to avoid peak scheduling contention.
Harden workflows
117b328 to
87e6b9f
Compare
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Requirements:
Special notes for your reviewer:
Release note: