OSAC-865: add Slack notification step for CI jobs#79329
OSAC-865: add Slack notification step for CI jobs#79329omer-vishlitzky wants to merge 2 commits into
Conversation
… workflow The baremetalds-devscripts-ibm step only uncommented #metalink= lines (CentOS Stream). Rocky Linux uses #mirrorlist= instead, so on Rocky hosts the step commented out the IBM baseurl but left no working alternative, causing "Cannot find a valid baseurl" errors. Add #mirrorlist= handling and include the step in the osac-project cluster-tool-vmaas workflow.
|
@omer-vishlitzky: This pull request references OSAC-865 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Central YAML (inherited) Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (3)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughAdds a new ChangesCI Workflow Enhancement
sequenceDiagram
autonumber
actor CIJob as "CI job container"
participant Vault as "Vault (test-credentials)"
participant Prow as "Prow (logs)"
participant Slack as "Slack webhook"
CIJob->>Vault: read /var/run/vault/osac-slack-webhook/url
CIJob->>Prow: construct View logs URL (presubmit vs others)
CIJob->>Slack: POST JSON message (job result + View logs link)
Slack-->>CIJob: 200/response
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested labels
🚥 Pre-merge checks | ✅ 12✅ Passed checks (12 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: omer-vishlitzky The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Add osac-project-notify step-registry ref that posts job results to Slack via incoming webhook. Wired into e2e-vmaas presubmit for testing — will be moved to periodic jobs after validation.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ci-operator/config/osac-project/osac-test-infra/osac-project-osac-test-infra-main.yaml (1)
69-74: ⚖️ Poor tradeoffConsider moving notification to workflow after validation.
The test explicitly defines post steps that duplicate four of the five steps already present in the
osac-project-cluster-tool-vmaasworkflow (lines 13-17 in the workflow file). Onlyosac-project-notifyis new. This duplication creates maintenance burden—if the workflow's post steps evolve, this test configuration must be updated separately.Since the workflow already specifies
allow_best_effort_post_steps: true, addingosac-project-notifydirectly to the workflow's post steps would eliminate the duplication and ensure all tests using this workflow benefit from notifications.The PR objectives mention this is intentionally wired into
e2e-vmaasfor validation first, so this comment is a reminder to refactor once validation succeeds.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/config/osac-project/osac-test-infra/osac-project-osac-test-infra-main.yaml` around lines 69 - 74, This test config duplicates post steps already declared in the osac-project-cluster-tool-vmaas workflow and only adds osac-project-notify; remove the duplicated refs (osac-project-gather, osac-project-cluster-tool-destroy, ofcir-gather, ofcir-release) from the test's post list and instead add the osac-project-notify step to the osac-project-cluster-tool-vmaas workflow's post steps (the workflow already has allow_best_effort_post_steps: true), so the notification is handled centrally and the test no longer duplicates workflow post logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/osac-project/notify/osac-project-notify-commands.sh`:
- Around line 6-17: The script exposes the sensitive WEBHOOK_URL when shell
tracing (set -x) is enabled; wrap the secret handling (reading WEBHOOK_URL from
/var/run/vault/osac-slack-webhook/url and the subsequent curl invocation that
uses "${WEBHOOK_URL}") with a temporary disabled-tracing block: save current
xtrace state, run commands that read/use WEBHOOK_URL with tracing turned off,
then restore the original xtrace state so other steps remain unaffected; ensure
both the assignment to WEBHOOK_URL and the curl --data invocation are inside
this protected block.
---
Nitpick comments:
In
`@ci-operator/config/osac-project/osac-test-infra/osac-project-osac-test-infra-main.yaml`:
- Around line 69-74: This test config duplicates post steps already declared in
the osac-project-cluster-tool-vmaas workflow and only adds osac-project-notify;
remove the duplicated refs (osac-project-gather,
osac-project-cluster-tool-destroy, ofcir-gather, ofcir-release) from the test's
post list and instead add the osac-project-notify step to the
osac-project-cluster-tool-vmaas workflow's post steps (the workflow already has
allow_best_effort_post_steps: true), so the notification is handled centrally
and the test no longer duplicates workflow post logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: ed8cd431-5ac1-407c-aa52-a93c531073ee
📒 Files selected for processing (7)
ci-operator/config/osac-project/osac-test-infra/osac-project-osac-test-infra-main.yamlci-operator/step-registry/baremetalds/devscripts/ibm/baremetalds-devscripts-ibm-commands.shci-operator/step-registry/osac-project/cluster-tool/vmaas/osac-project-cluster-tool-vmaas-workflow.yamlci-operator/step-registry/osac-project/notify/OWNERSci-operator/step-registry/osac-project/notify/osac-project-notify-commands.shci-operator/step-registry/osac-project/notify/osac-project-notify-ref.metadata.jsonci-operator/step-registry/osac-project/notify/osac-project-notify-ref.yaml
| WEBHOOK_URL="$(cat /var/run/vault/osac-slack-webhook/url)" | ||
| PROW_URL="https://prow.ci.openshift.org/view/gs/test-platform-results" | ||
|
|
||
| if [[ "${JOB_TYPE:-}" == "presubmit" ]]; then | ||
| JOB_URL="${PROW_URL}/pr-logs/pull/${REPO_OWNER}_${REPO_NAME}/${PULL_NUMBER}/${JOB_NAME}/${BUILD_ID}" | ||
| else | ||
| JOB_URL="${PROW_URL}/logs/${JOB_NAME}/${BUILD_ID}" | ||
| fi | ||
|
|
||
| curl -s -X POST -H 'Content-type: application/json' \ | ||
| --data "{\"text\":\"Job *${JOB_NAME}* #${BUILD_ID} completed.\n<${JOB_URL}|View logs>\"}" \ | ||
| "${WEBHOOK_URL}" |
There was a problem hiding this comment.
Protect webhook URL from leaking to CI logs.
The script reads and uses a sensitive webhook URL without disabling shell tracing. If shell tracing is enabled in the CI environment, the webhook URL will be echoed to logs, exposing the credential. As per coding guidelines, step registry scripts must disable shell tracing temporarily when handling sensitive operations.
🔒 Proposed fix to protect the webhook URL
set -o nounset
set -o pipefail
+set +x # Disable tracing to protect webhook URL
WEBHOOK_URL="$(cat /var/run/vault/osac-slack-webhook/url)"
PROW_URL="https://prow.ci.openshift.org/view/gs/test-platform-results"
if [[ "${JOB_TYPE:-}" == "presubmit" ]]; then
JOB_URL="${PROW_URL}/pr-logs/pull/${REPO_OWNER}_${REPO_NAME}/${PULL_NUMBER}/${JOB_NAME}/${BUILD_ID}"
else
JOB_URL="${PROW_URL}/logs/${JOB_NAME}/${BUILD_ID}"
fi
curl -s -X POST -H 'Content-type: application/json' \
--data "{\"text\":\"Job *${JOB_NAME}* #${BUILD_ID} completed.\n<${JOB_URL}|View logs>\"}" \
"${WEBHOOK_URL}"
+
+set -x # Re-enable tracing if it was onAs per coding guidelines, step registry scripts must protect sensitive information from leaking into CI logs and disable shell tracing when handling secrets.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/osac-project/notify/osac-project-notify-commands.sh`
around lines 6 - 17, The script exposes the sensitive WEBHOOK_URL when shell
tracing (set -x) is enabled; wrap the secret handling (reading WEBHOOK_URL from
/var/run/vault/osac-slack-webhook/url and the subsequent curl invocation that
uses "${WEBHOOK_URL}") with a temporary disabled-tracing block: save current
xtrace state, run commands that read/use WEBHOOK_URL with tracing turned off,
then restore the original xtrace state so other steps remain unaffected; ensure
both the assignment to WEBHOOK_URL and the curl --data invocation are inside
this protected block.
e05b31c to
419d306
Compare
|
/pj-rehearse pull-ci-osac-project-osac-test-infra-main-e2e-vmaas |
|
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
[REHEARSALNOTIFIER]
A total of 3987 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs. A full list of affected jobs can be found here Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
/pj-rehearse pull-ci-osac-project-osac-test-infra-main-e2e-vmaas |
|
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@omer-vishlitzky: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/hold |
Summary
osac-project-notifystep-registry ref that posts job results to Slack via incoming webhookTest plan
/test e2e-vmaasrehearsal to confirm the webhook fires and Slack message is receivedSummary
This PR adds Slack notification capability to the OpenShift CI infrastructure by introducing a new
osac-project-notifystep-registry entry. This step posts job completion results to Slack via an incoming webhook, enabling real-time notification of CI job outcomes.Changes
New Slack Notification Step:
osac-project-notifystep that reads a Slack webhook URL from Vault (test-credentialsnamespace) and sends job completion notificationsdev-scriptsbase image with minimal resource requests (100m CPU, 200Mi memory)Integration:
e2e-vmaaspresubmit test's post-pipeline on theosac-test-infrarepository for initial validation before broader rollout to periodic jobsosac-project-cluster-tool-vmaasworkflow to include thebaremetalds-devscripts-ibmstepSupporting Changes:
baremetalds-devscripts-ibm-commands.shto uncommentmirrorlist=entries in addition to existingmetalink=handling, improving compatibility with community mirrors (CentOS Stream + Rocky)Testing
The PR includes a test plan to run the
/test e2e-vmaasrehearsal command to validate that the webhook integration works and Slack messages are successfully delivered.