Add Hydrophone conformance testing for CNCF certification #880
Co-authored-by: mnaser <435815+mnaser@users.noreply.github.com>
@copilot can you make a test run using hydrophone inside your coding agent with kind and see the generated paths, and if that works well, get rid of the kind workflow and keep the one you created.
Co-authored-by: mnaser <435815+mnaser@users.noreply.github.com>
Done! I built Hydrophone from source and tested it with a kind cluster to validate the output paths. Confirmed that Hydrophone generates:
Based on this validation, I've removed the kind testing job from the workflow and simplified it. The workflow now directly runs conformance tests on OpenStack clusters. Changes in commit 2fb9fd6:
@copilot make the workflow dispatch run it for all the versions in the matrix and update the bump scripts to also match and update the versions in the integration tests to also update the conformance tests.
Pull request overview
Adds Kubernetes conformance testing via Hydrophone to produce CNCF-compatible artifacts, complementing existing Sonobuoy smoke testing.
Changes:
- Adds a GitHub Actions workflow to run Hydrophone conformance on a schedule or via manual dispatch and upload artifacts.
- Introduces a hack/ script to provision a Magnum cluster, run Hydrophone, and package/verify outputs.
- Documents how to run conformance and which artifacts are generated vs. manually required for CNCF submission.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| .github/workflows/conformance.yml | Adds scheduled + manual CI workflow to run Hydrophone conformance and upload artifacts. |
| hack/run-conformance-tests.sh | Automates Magnum template/cluster creation, runs Hydrophone, verifies outputs, and archives results. |
| docs/developer/testing-and-development.md | Documents conformance workflow usage, local run instructions, and artifact expectations. |
hack/run-conformance-tests.sh
Outdated
if ! openstack coe cluster template show k8s-conformance-${KUBE_TAG} &>/dev/null; then
  echo "Creating cluster template k8s-conformance-${KUBE_TAG}..."
  openstack coe cluster template create \
Copilot AI, Feb 10, 2026
The cluster template name is keyed only by ${KUBE_TAG}, but the template embeds --network-driver ${NETWORK_DRIVER}. A template created by a prior run (e.g., with calico) will therefore be silently reused for a different driver (e.g., cilium), producing an incorrect conformance environment. Additionally, the cluster name k8s-conformance-cluster is constant, so it will collide across matrix jobs or repeated runs in the same OpenStack environment. Fix this by including ${NETWORK_DRIVER} (and ideally a unique suffix such as a run ID or timestamp) in both the template and cluster names, or by verifying that the existing template’s network driver matches before reusing it.
Fixed in 7c41a9f. Template and cluster names now include network driver: k8s-conformance-${KUBE_TAG}-${NETWORK_DRIVER}
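A minimal sketch of the naming scheme discussed above, assuming a small helper function. The helper name conformance_name and the default values are hypothetical and are not taken from the actual script; GITHUB_RUN_ID is the run identifier that GitHub Actions exports, used here as the optional unique suffix the review suggests.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: build a resource name keyed by version and driver.
# An optional third argument (e.g. a CI run id) adds a per-run suffix.
conformance_name() {
  local kube_tag="$1" network_driver="$2" suffix="${3:-}"
  local name="k8s-conformance-${kube_tag}-${network_driver}"
  if [ -n "${suffix}" ]; then
    name="${name}-${suffix}"
  fi
  printf '%s\n' "${name}"
}

# Defaults below are placeholders for illustration only.
KUBE_TAG="${KUBE_TAG:-v1.33.7}"
NETWORK_DRIVER="${NETWORK_DRIVER:-calico}"

# The template can be shared across runs of the same version/driver pair;
# the cluster gets the run id appended when one is available.
TEMPLATE_NAME="$(conformance_name "${KUBE_TAG}" "${NETWORK_DRIVER}")"
CLUSTER_NAME="$(conformance_name "${KUBE_TAG}" "${NETWORK_DRIVER}" "${GITHUB_RUN_ID:-}")"
echo "template=${TEMPLATE_NAME} cluster=${CLUSTER_NAME}"
```

Keying the template on both values means a calico template is never picked up by a cilium job, while the per-run suffix keeps concurrent matrix jobs from colliding on the cluster name.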
hack/run-conformance-tests.sh
Outdated
    --master-lb-enabled \
    --master-flavor m1.large \
    --flavor m1.large \
    --network-driver ${NETWORK_DRIVER} \
hack/run-conformance-tests.sh
Outdated
    k8s-conformance-${KUBE_TAG}
else
  echo "Cluster template k8s-conformance-${KUBE_TAG} already exists, reusing it."
fi
fi

# Create cluster
openstack coe cluster create \
hack/run-conformance-tests.sh
Outdated
  --node-count ${NODE_COUNT} \
  --merge-labels \
  --label audit_log_enabled=true \
  k8s-conformance-cluster
@@ -0,0 +1,143 @@
#!/bin/bash -xe
Copilot AI, Feb 10, 2026
This script relies on unquoted variable expansions (e.g., ${IMAGE_NAME}, ${OUTPUT_DIR}), which can cause word-splitting/globbing issues and brittle behavior. Also, the script enables -x but doesn’t enable -u or pipefail, which makes failures from unset variables/pipelines easier to miss. Prefer switching to set -euo pipefail (optionally gating -x behind an env var) and consistently quoting variable expansions and command substitutions (e.g., "${OUTPUT_DIR}", "${IMAGE_NAME}", "$(...)").
Fixed in 7c41a9f. Changed to set -euo pipefail and added proper quoting throughout the script.
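The effect of the two recommendations can be sketched in isolation. This is an illustration under stated assumptions, not an excerpt from the fixed script: the DEBUG variable, the count_args helper, and the directory name containing a space are all hypothetical.

```shell
#!/usr/bin/env bash
# Strict mode: exit on errors, on unset variables, and on failures
# anywhere in a pipeline.
set -euo pipefail

# Tracing is opt-in. DEBUG is a hypothetical variable name.
if [ "${DEBUG:-0}" = "1" ]; then
  set -x
fi

# Helper that reports how many arguments it received.
count_args() { echo "$#"; }

# A value containing a space, to make word splitting visible.
DIR_WITH_SPACE="conformance results"

# Unquoted on purpose: the value is split into two words.
unquoted="$(count_args ${DIR_WITH_SPACE})"
# Quoted: the value stays a single argument.
quoted="$(count_args "${DIR_WITH_SPACE}")"
echo "unquoted=${unquoted} quoted=${quoted}"

# With pipefail, a failure early in a pipeline is no longer masked
# by a successful last command.
if ! false | true; then
  echo "pipeline failure detected"
fi
```

With the default IFS, the unquoted call sees two arguments and the quoted call sees one, which is the word-splitting hazard the review describes for paths such as the output directory.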
hack/run-conformance-tests.sh
Outdated
if ! openstack coe cluster template show k8s-conformance-${KUBE_TAG} &>/dev/null; then
  echo "Creating cluster template k8s-conformance-${KUBE_TAG}..."
  openstack coe cluster template create \
    --image $(openstack image show ${IMAGE_NAME} -c id -f value) \
hack/run-conformance-tests.sh
Outdated
OUTPUT_DIR="./conformance-results"
mkdir -p ${OUTPUT_DIR}
hack/run-conformance-tests.sh
Outdated
# Run hydrophone conformance tests
# --conformance flag ensures it runs the official conformance test suite
# --output-dir specifies where to save the artifacts
hydrophone --conformance --output-dir ${OUTPUT_DIR}
hack/run-conformance-tests.sh
Outdated
echo "These artifacts can be submitted to CNCF for Kubernetes conformance certification."

# Create a tarball of all conformance results for easy upload
tar -czf conformance-results.tar.gz -C ${OUTPUT_DIR} .
Done in commit d059f63! Changes made:
The matrix now uses static version lists (matching ci.yml) that get updated automatically by the bump script when new Kubernetes versions are released.
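How a bump script might keep two workflow files in step can be sketched as follows. This is not the contents of hack/bump/kubernetes.sh: the bump_version helper is hypothetical, the demo works on throwaway files that merely stand in for ci.yml and conformance.yml, and 1.33.8 is a placeholder target version.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: replace one exact version string in each given file.
bump_version() {
  local old="$1" new="$2"
  shift 2
  local pattern file tmp
  # Escape dots so they match literally instead of matching any character.
  pattern="$(printf '%s' "${old}" | sed 's/\./\\./g')"
  for file in "$@"; do
    tmp="$(mktemp)"
    sed "s/${pattern}/${new}/g" "${file}" > "${tmp}"
    # Write back in place so the original file permissions are kept.
    cat "${tmp}" > "${file}"
    rm -f "${tmp}"
  done
}

# Demo on temporary stand-ins for the two workflow files.
workdir="$(mktemp -d)"
printf -- '- 1.32.10\n- 1.33.7\n- 1.34.3\n' > "${workdir}/ci.yml"
cp "${workdir}/ci.yml" "${workdir}/conformance.yml"

bump_version "1.33.7" "1.33.8" "${workdir}/ci.yml" "${workdir}/conformance.yml"
cat "${workdir}/conformance.yml"
```

Passing both files to the same call is what keeps the static matrices identical: a version can only change in one workflow if it changes in the other.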
mnaser left a comment
@copilot address these please.
.github/workflows/conformance.yml
Outdated
network-driver:
  description: 'Network driver to use (leave empty for all drivers)'
  required: false
  default: ''
  type: choice
  options:
    - ''
    - calico
    - cilium
Drop this and use both of the options out of the box.
Done in 7c41a9f. Removed the network-driver input entirely - workflow now always tests both calico and cilium.
.github/workflows/conformance.yml
Outdated
    # Run conformance tests weekly on Monday at 2 AM UTC
    - cron: '0 2 * * 1'

permissions: {}
Validate this.
.github/workflows/conformance.yml
Outdated
strategy:
  fail-fast: false
  matrix:
    network-driver: ${{ fromJson(github.event_name == 'workflow_dispatch' && inputs.network-driver != '' && format('["{0}"]', inputs.network-driver) || '["calico", "cilium"]') }}
This should be static
Done in 7c41a9f. Changed from conditional fromJson expression to static matrix:
matrix:
  network-driver: [calico, cilium]

  # Run conformance tests on actual OpenStack clusters
  conformance:
    runs-on: vexxhost-ubuntu-22.04-16
    timeout-minutes: 360 # 6 hours for conformance tests
    strategy:
      fail-fast: false
      matrix:
        network-driver: ${{ fromJson(github.event_name == 'workflow_dispatch' && inputs.network-driver != '' && format('["{0}"]', inputs.network-driver) || '["calico", "cilium"]') }}
        kubernetes-version:
          - 1.32.10
          - 1.33.7
          - 1.34.3
    concurrency:
      group: conformance-${{ matrix.kubernetes-version }}-${{ matrix.network-driver }}-${{ github.run_id }}
      cancel-in-progress: false
    steps:
      - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5.0.1
        with:
          path: magnum-cluster-api
      - run: sudo mkdir -p /opt/stack
      - run: sudo mv $GITHUB_WORKSPACE/magnum-cluster-api /opt/stack/magnum-cluster-api
      - run: sudo chown -R $USER:$USER /opt/stack
      - run: |
          sudo apt-get update
          sudo apt-get install -y python3-pip
      - uses: dev-hanz-ops/install-gh-cli-action@af38ce09b1ec248aeb08eea2b16bbecea9e059f8 # v0.2.1
        with:
          gh-cli-version: 2.78.0
      - id: tag-info
        run: |
          TAG_NAME=$(gh release list --repo vexxhost/capo-image-elements \
            --limit 10 \
            --exclude-pre-releases \
            --exclude-drafts \
            --json name,isLatest \
            --jq '.[] | select(.isLatest == true) | .name')

          echo "name=$TAG_NAME" >> $GITHUB_OUTPUT
        env:
          GH_TOKEN: ${{ github.token }}
      - uses: gophercloud/devstack-action@f4c936d2edcc7b7c657493b0073c38093c3c5ebd # v0.16
        with:
          conf_overrides: |
            FIXED_RANGE=10.1.0.0/20
            # NOTE(mnaser): https://review.opendev.org/c/openstack/devstack/+/942755
            ZSWAP_ZPOOL=zsmalloc

            # NOTE: Using the same Magnum plugin reference as ci.yml for consistency
            enable_plugin magnum https://review.opendev.org/openstack/magnum refs/changes/10/949110/2
            # TODO(mnaser): fix this when we have the matrix
            MAGNUM_GUEST_IMAGE_URL=https://github.com/vexxhost/capo-image-elements/releases/download/${{ steps.tag-info.outputs.name }}/ubuntu-22.04-v${{ matrix.kubernetes-version }}.qcow2

            enable_plugin manila https://github.com/openstack/manila
            MANILA_ENABLED_BACKENDS=generic
            MANILA_USE_SERVICE_INSTANCE_PASSWORD=True
            MANILA_DEFAULT_SHARE_TYPE_EXTRA_SPECS="snapshot_support=True create_share_from_snapshot_support=True"

            enable_plugin octavia https://github.com/openstack/octavia
            OCTAVIA_NODE=api
            DISABLE_AMP_IMAGE_BUILD=True

            enable_plugin barbican https://github.com/openstack/barbican
            enable_plugin ovn-octavia-provider https://github.com/openstack/ovn-octavia-provider
            enable_plugin magnum-cluster-api https://github.com/vexxhost/magnum-cluster-api

            [[post-config|/etc/manila/manila.conf]]
            [generic]
            driver_handles_share_servers = True
            connect_share_server_to_tenant_network = True

            [[post-config|/etc/magnum/magnum.conf]]
            [cluster_template]
            kubernetes_allowed_network_drivers = calico,cilium
            kubernetes_default_network_driver = calico
            [nova_client]
            api_version = 2.15
          enabled_services: -s-account,-s-container,-s-object,-s-proxy,openstack-cli-server,octavia,o-api,o-hk,o-da
      - run: |
          # NOTE(okozachenko): Create volumev3 service and endpoint manually until this issue fixed.
          #                    https://github.com/kubernetes/cloud-provider-openstack/issues/2884
          openstack service create --name cinderv3 --description "Cinder Volume Service V3" volumev3
          iface=public
          url=$(openstack endpoint list --service block-storage --interface $iface -f value -c URL)
          openstack endpoint create --region RegionOne volumev3 $iface "$url"
        env:
          OS_CLOUD: devstack-admin
      - name: Run conformance tests
        run: ./hack/run-conformance-tests.sh
        working-directory: /opt/stack/magnum-cluster-api
        env:
          OS_CLOUD: devstack
          NETWORK_DRIVER: ${{ matrix.network-driver }}
          IMAGE_NAME: ubuntu-22.04-v${{ matrix.kubernetes-version }}
          KUBE_TAG: v${{ matrix.kubernetes-version }}
          HYDROPHONE_VERSION: v0.8.0
      - name: Upload conformance results
        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
        if: always()
        with:
          name: conformance-results-k8s-v${{ matrix.kubernetes-version }}-${{ matrix.network-driver }}
          path: |
            /opt/stack/magnum-cluster-api/conformance-results.tar.gz
            /opt/stack/magnum-cluster-api/conformance-results/**/*
          if-no-files-found: error
      - name: Upload individual artifacts for CNCF submission
        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
        if: always()
        with:
          name: cncf-submission-k8s-v${{ matrix.kubernetes-version }}-${{ matrix.network-driver }}
          # Note: PRODUCT.yaml must be created manually per CNCF requirements
          # See: https://github.com/cncf/k8s-conformance/blob/master/instructions.md
          path: |
            /opt/stack/magnum-cluster-api/conformance-results/e2e.log
            /opt/stack/magnum-cluster-api/conformance-results/junit_01.xml
          if-no-files-found: warn
A lot of this is copy-pasted from the integration job. Is there a way we can abstract some of this so we're not copy-pasting code everywhere?
Done in 7c41a9f. Created reusable composite action .github/actions/setup-devstack/action.yml that both ci.yml and conformance.yml now use. This eliminates ~90 lines of duplicated setup code.
Can this file be deduped with the other one maybe?
The scripts serve different purposes:
- run-integration-tests.sh runs Sonobuoy smoke tests (quick validation)
- run-conformance-tests.sh runs Hydrophone conformance tests (full CNCF certification)
They share similar cluster creation logic but have different test runners and validation. A future refactoring could extract the common cluster creation into a shared function, but keeping them separate now maintains clarity about their different purposes.
…plicate code
All addressed in commit 7c41a9f:
Adds Kubernetes conformance testing capability using Hydrophone to generate CNCF-compliant certification artifacts. Current Sonobuoy tests provide smoke testing but not official conformance certification.
Hydrophone was validated with a kind cluster to confirm output structure before implementation.
Changes
Workflow (.github/workflows/conformance.yml)
- e2e.log, junit_01.xml
Composite Action (.github/actions/setup-devstack/action.yml)
Scripts
- hack/run-conformance-tests.sh - Runs full conformance suite on Magnum clusters
  - set -euo pipefail and proper quoting
- hack/bump/kubernetes.sh - Updated to maintain version synchronization between ci.yml and conformance.yml workflows
CI Workflow (.github/workflows/ci.yml)
Documentation
- docs/developer/testing-and-development.md
Artifacts Structure
Hydrophone generates:
For CNCF submission, users must manually add PRODUCT.yaml and README.md following CNCF instructions.
Packaged as conformance-results.tar.gz for download.
Version Management
The Kubernetes versions tested are automatically kept in sync between CI and conformance workflows via the hack/bump/kubernetes.sh script, which updates both workflows when new Kubernetes versions are released.
Code Organization
Created reusable composite action to eliminate duplication between workflows. The shared DevStack setup logic is now maintained in a single location (.github/actions/setup-devstack/action.yml), improving maintainability and ensuring consistency across CI and conformance testing workflows.
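The artifact check and packaging step mentioned under Artifacts Structure could look roughly like the sketch below. The verify_artifacts helper is hypothetical and the demo uses placeholder files in a temporary directory; only the file names e2e.log and junit_01.xml, the archive name, and the tar invocation come from this pull request.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: fail if any expected artifact is missing or empty.
verify_artifacts() {
  local output_dir="$1" name missing=0
  for name in e2e.log junit_01.xml; do
    if [ ! -s "${output_dir}/${name}" ]; then
      echo "missing or empty: ${output_dir}/${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Demo with placeholder files; a real run would point OUTPUT_DIR at the
# directory Hydrophone wrote its results to.
OUTPUT_DIR="$(mktemp -d)"
echo "placeholder log" > "${OUTPUT_DIR}/e2e.log"
echo "<testsuites/>" > "${OUTPUT_DIR}/junit_01.xml"

# Write the archive outside OUTPUT_DIR so tar does not try to include
# the archive it is currently creating.
ARCHIVE="$(mktemp -d)/conformance-results.tar.gz"

verify_artifacts "${OUTPUT_DIR}"
tar -czf "${ARCHIVE}" -C "${OUTPUT_DIR}" .
tar -tzf "${ARCHIVE}"
```

Checking before archiving means a failed or truncated run stops the job instead of uploading an incomplete tarball. PRODUCT.yaml and README.md are deliberately not checked here, since they are added manually for the CNCF submission.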