TEST: Add debug workflow for techpreview serial testing #79352
Adds openshift-e2e-aws-ovn-serial-debug workflow with:
- cucushift-installer-wait step for extended cluster access
- SLEEP_DURATION environment variable support (default 2h, max 72h)
- 12h timeout for debugging scenarios
- Same configuration as techpreview-serial (AWS + OVN + TechPreview)
This allows QE engineers to debug clusters for up to 8 hours
using SLEEP_DURATION environment variable via gangway-cli.
Usage:
gangway-cli --job-name periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-techpreview-serial-debug \
--initial <release-image> \
--env SLEEP_DURATION=8h \
--env TEST_ARGS="--dry-run"
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Walkthrough
Adds a new OpenShift e2e AWS OVN serial-debug CI workflow, its metadata and OWNERS, and registers the
Changes
AWS OVN Serial Debug Workflow
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: weliang1 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ci-operator/config/openshift/release/openshift-release-main__ci-4.22.yaml`:
- Around line 147-157: The CI config defines a new test entry
e2e-aws-ovn-techpreview-serial-debug (workflow:
openshift-e2e-aws-ovn-serial-debug) but the generated Prow job manifests are
missing; run make update locally to regenerate downstream job configs (which
will create/modify files under ci-operator/jobs/) and commit those generated
changes along with your YAML edit so the new test exists in both
ci-operator/config and the generated ci-operator/jobs outputs.
In
`@ci-operator/step-registry/openshift/e2e/aws/ovn/serial-debug/openshift-e2e-aws-ovn-serial-debug-workflow.yaml`:
- Around line 31-32: Update the workflow documentation stanza that describes
SLEEP_DURATION to reflect the effective cap enforced by this workflow: change
the "max 72h" wording to state the 12h enforced timeout (or explicitly state
"max 72h, but capped to 12h in this workflow") so users aren't misled; reference
the SLEEP_DURATION environment variable and the cucushift-installer-wait usage
in the openshift-e2e-aws-ovn-serial-debug-workflow.yaml so it's clear the 12h
wait timeout applies to this workflow.
⛔ Files ignored due to path filters (1)
- ci-operator/jobs/openshift/release/openshift-release-main-periodics.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (3)
- ci-operator/config/openshift/release/openshift-release-main__ci-4.22.yaml
- ci-operator/step-registry/openshift/e2e/aws/ovn/serial-debug/openshift-e2e-aws-ovn-serial-debug-workflow.metadata.json
- ci-operator/step-registry/openshift/e2e/aws/ovn/serial-debug/openshift-e2e-aws-ovn-serial-debug-workflow.yaml
- as: e2e-aws-ovn-techpreview-serial-debug
  interval: 168h
  steps:
    cluster_profile: openshift-org-aws
    env:
      FEATURE_SET: TechPreviewNoUpgrade
    observers:
      enable:
      - observers-resource-watch
    workflow: openshift-e2e-aws-ovn-serial-debug
  timeout: 12h0m0s
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Confirm the new test stanza exists in ci-operator config
rg -n --iglob '*.yaml' 'as:\s*e2e-aws-ovn-techpreview-serial-debug' ci-operator/config
# Confirm generated Prow job manifests reference the new test/job name
if [ -d ci-operator/jobs ]; then
  rg -n --iglob '*.ya?ml' \
    'periodic-ci-openshift-release-main-ci-4\.22-e2e-aws-ovn-techpreview-serial-debug|e2e-aws-ovn-techpreview-serial-debug' \
    ci-operator/jobs
else
  echo "ci-operator/jobs directory is not present in this checkout."
fi
Repository: openshift/release
Length of output: 182
Run make update to generate downstream Prow job configuration for this new test.
The test entry e2e-aws-ovn-techpreview-serial-debug exists in the CI configuration, but the corresponding generated Prow job config was not found. Per the coding guidelines for ci-operator/config/**/*.yaml, after editing the configuration file, you must run make update to generate the downstream Prow job manifests in ci-operator/jobs/.
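The check described above can be approximated locally after running make update. The following is a minimal sketch, not the bot's actual script: the function names has_entry and check_generated are hypothetical, and it assumes a grep that supports -r and --include (GNU or BSD grep).

```shell
#!/bin/bash
# Hypothetical helper: succeed only if NAME appears in some *.yaml file under DIR.
has_entry() {
  local name="$1" dir="$2"
  [ -d "$dir" ] && grep -rqF --include='*.yaml' -- "$name" "$dir"
}

# Hypothetical helper: report whether a test name is present in both the
# hand-edited config tree and the generated jobs tree below ROOT.
check_generated() {
  local name="$1" root="${2:-.}"
  if ! has_entry "$name" "$root/ci-operator/config"; then
    echo "missing in ci-operator/config: $name"
    return 1
  fi
  if ! has_entry "$name" "$root/ci-operator/jobs"; then
    echo "missing in ci-operator/jobs (run make update): $name"
    return 1
  fi
  echo "found in config and jobs: $name"
}
```

From the root of a checkout, calling check_generated e2e-aws-ovn-techpreview-serial-debug reports which of the two trees, if any, lacks the entry. Unlike a script that runs under set -e, a missing match in the first tree is reported rather than aborting silently.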
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@ci-operator/config/openshift/release/openshift-release-main__ci-4.22.yaml`
around lines 147 - 157, The CI config defines a new test entry
e2e-aws-ovn-techpreview-serial-debug (workflow:
openshift-e2e-aws-ovn-serial-debug) but the generated Prow job manifests are
missing; run make update locally to regenerate downstream job configs (which
will create/modify files under ci-operator/jobs/) and commit those generated
changes along with your YAML edit so the new test exists in both
ci-operator/config and the generated ci-operator/jobs outputs.
- Includes cucushift-installer-wait step for extended debugging (up to 12 hours)
- Supports SLEEP_DURATION environment variable (default 2h, max 72h)
Align debug-duration docs with the enforced 12h limit.
Line 32 says SLEEP_DURATION supports up to 72h, but Line 15 enforces a 12h wait timeout in this workflow. Please document the effective cap for this workflow to avoid failed expectations.
Suggested doc fix
- - Includes cucushift-installer-wait step for extended debugging (up to 12 hours)
- - Supports SLEEP_DURATION environment variable (default 2h, max 72h)
+ - Includes cucushift-installer-wait step for extended debugging (up to 12 hours in this workflow)
+ - Supports SLEEP_DURATION environment variable (default 2h; effective max 12h due to workflow timeout)
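To illustrate why the effective cap is 12h rather than 72h, here is a rough sketch of the arithmetic. The helpers to_seconds and effective_sleep are hypothetical and are not part of cucushift-installer-wait. The sketch also assumes the whole 12h timeout is available for sleeping, which overstates the real budget if install and test time count against the same timeout.

```shell
#!/bin/bash
# Hypothetical helper: convert durations such as "30m", "8h" or "2d" to seconds.
to_seconds() {
  local d="$1"
  local value="${d%[smhd]}"   # numeric part
  local unit="${d: -1}"       # trailing unit letter
  case "$unit" in
    s) echo "$value" ;;
    m) echo $((value * 60)) ;;
    h) echo $((value * 3600)) ;;
    d) echo $((value * 86400)) ;;
    *) echo "unsupported duration: $d" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print whichever of the requested sleep and the timeout is shorter.
effective_sleep() {
  local requested="$1" timeout="$2"
  local req_s timeout_s
  req_s=$(to_seconds "$requested") || return 1
  timeout_s=$(to_seconds "$timeout") || return 1
  if [ "$req_s" -gt "$timeout_s" ]; then
    echo "$timeout"
  else
    echo "$requested"
  fi
}

effective_sleep 8h 12h    # prints 8h
effective_sleep 72h 12h   # prints 12h
```

Under these assumptions, the SLEEP_DURATION=8h from the usage example fits inside the timeout, while a request for the documented 72h maximum would be cut off at 12h.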
- Add OWNERS file required by step-registry-metadata check
- Update metadata.json format (auto-generated by make update)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
[REHEARSALNOTIFIER]
Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
/pj-rehearse periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-techpreview-serial-debug |
|
@weliang1: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/cancle |
|
@weliang1: The following test failed, say
Summary
This PR adds a new debug workflow to the OpenShift CI configuration in the openshift/release repository. The workflow, openshift-e2e-aws-ovn-serial-debug, provides extended cluster access for debugging TechPreview serial test runs (AWS + OVN) by keeping the cluster available after test execution.
What changed (practical impact)
These changes affect CI job definitions and step-registry workflows used by OpenShift CI; no application code or public library APIs are changed.
Key features and operational notes
Files added/updated
Risk and review effort