Commit 5573b55
pkg/monitortests/clusterversionoperator/legacycvomonitortests: Expand monitoring reason exceptions
To cover the main hits in:
$ curl -s 'https://search.ci.openshift.org/search?maxAge=48h&type=junit&name=4.15.*upgrade&context=0&search=clusteroperator/monitoring.*condition/Available.*status/False' | jq -r 'to_entries[].value | to_entries[].value[].context[]' | sed 's|.*clusteroperator/\([^ ]*\) condition/Available reason/\([^ ]*\) status/False[^:]*: \(.*\)|\1 \2 \3|' | sort | uniq -c | sort -n
1 monitoring PlatformTasksFailed reconciling Console Plugin Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/monitoring-plugin: context deadline exceeded, UpdatingAlertmanager: waiting for Alertmanager object changes failed: waiting for Alertmanager openshift-monitoring/main: context deadline exceeded, UpdatingPrometheusAdapter: reconciling PrometheusAdapter Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-adapter: context deadline exceeded, UpdatingThanosQuerier: reconciling Thanos Querier Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/thanos-querier: context deadline exceeded, UpdatingPrometheus: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
1 monitoring PlatformTasksFailed reconciling Console Plugin failed: retrieving ConsolePlugin object failed: conversion webhook for console.openshift.io/v1alpha1, Kind=ConsolePlugin failed: Post "https://webhook.openshift-console-operator.svc:9443/crdconvert?timeout=30s": no endpoints available for service "webhook", UpdatingPrometheus: reconciling Prometheus object failed: updating Prometheus object failed: Operation cannot be fulfilled on prometheuses.monitoring.coreos.com "k8s": the object has been modified; please apply your changes to the latest version and try again
1 monitoring UpdatingPrometheusOperatorFailed reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: context deadline exceeded
3 monitoring UpdatingAlertmanagerFailed reconciling Alertmanager object failed: updating Alertmanager object failed: Operation cannot be fulfilled on alertmanagers.monitoring.coreos.com "main": the object has been modified; please apply your changes to the latest version and try again
4 monitoring PlatformTasksFailed reconciling Console Plugin failed: retrieving ConsolePlugin object failed: conversion webhook for console.openshift.io/v1alpha1, Kind=ConsolePlugin failed: Post "https://webhook.openshift-console-operator.svc:9443/crdconvert?timeout=30s": no endpoints available for service "webhook", UpdatingPrometheus: client rate limiter Wait returned an error: context deadline exceeded
15 monitoring UpdatingConsolePluginComponentsFailed reconciling Console Plugin failed: retrieving ConsolePlugin object failed: conversion webhook for console.openshift.io/v1alpha1, Kind=ConsolePlugin failed: Post "https://webhook.openshift-console-operator.svc:9443/crdconvert?timeout=30s": no endpoints available for service "webhook"
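For anyone who would rather not decode the sed expression, here is a rough Go sketch of the same extract-and-count step. The names `summarize` and `countHits` and the sample lines are made up for illustration; they are not part of origin. Unlike sed, which passes non-matching lines through unchanged, this sketch skips them.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// Mirrors the sed pattern: capture the operator name, the reason, and
// the message that follows the first ": " after status/False.
var availableFalse = regexp.MustCompile(
	`clusteroperator/([^ ]*) condition/Available reason/([^ ]*) status/False[^:]*: (.*)`)

// summarize returns "operator reason message" for an Available=False
// line, or false when the line does not match.
func summarize(line string) (string, bool) {
	m := availableFalse.FindStringSubmatch(line)
	if m == nil {
		return "", false
	}
	return m[1] + " " + m[2] + " " + m[3], true
}

// countHits tallies identical summaries, like `sort | uniq -c`.
func countHits(lines []string) map[string]int {
	counts := map[string]int{}
	for _, line := range lines {
		if s, ok := summarize(line); ok {
			counts[s]++
		}
	}
	return counts
}

func main() {
	// Illustrative input only, not real CI search output.
	lines := []string{
		"Nov 28 20:24:17.720 W clusteroperator/monitoring condition/Available reason/UpdatingAlertmanagerFailed status/False UpdatingAlertmanager: reconciling Alertmanager object failed",
		"Nov 28 20:26:08.168 W clusteroperator/monitoring condition/Available reason/RollOutDone status/True Successfully rolled out the stack.",
	}
	counts := countHits(lines)
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	// Ascending by count, like the trailing `sort -n`; ties broken by text.
	sort.Slice(keys, func(i, j int) bool {
		if counts[keys[i]] != counts[keys[j]] {
			return counts[keys[i]] < counts[keys[j]]
		}
		return keys[i] < keys[j]
	})
	for _, k := range keys {
		fmt.Printf("%7d %s\n", counts[k], k)
	}
}
```

Note that `[^:]*: ` swallows the leading task name (for example `UpdatingAlertmanager: `), which is why the messages in the counts above start at "reconciling ...".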
I've also commented about these in [1], but I have no problem if folks
decide to fork that bug into multiple trackers and have specific
exceptions for each tracker.
And I'm including UpdatingPrometheusFailed for [2]:
: [bz-Monitoring] clusteroperator/monitoring should not change condition/Available 1h54m23s
{ 2 unexpected clusteroperator state transitions during e2e test run. These did not match any known exceptions, so they cause this test-case to fail:
Nov 28 20:24:17.720 W clusteroperator/monitoring condition/Available reason/UpdatingPrometheusFailed status/Unknown UpdatingPrometheus: client rate limiter Wait returned an error: context deadline exceeded
Nov 28 20:24:17.720 - 110s E clusteroperator/monitoring condition/Available reason/UpdatingPrometheusFailed status/Unknown UpdatingPrometheus: client rate limiter Wait returned an error: context deadline exceeded
1 unwelcome but acceptable clusteroperator state transitions during e2e test run. These should not happen, but because they are tied to exceptions, the fact that they did happen is not sufficient to cause this test-case to fail:
Nov 28 20:26:08.168 W clusteroperator/monitoring condition/Available reason/RollOutDone status/True Successfully rolled out the stack. (exception: Available=True is the happy case)
}
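To make the shape of a reason-based exception concrete, here is a minimal self-contained sketch. It is not the code from legacycvomonitortests: the `condition` type, the `exceptionFor` function, and the choice to treat any non-True status (False or Unknown) as covered are assumptions for illustration. The reason names are the ones quoted in this message, and which of them were already excepted before this commit is not shown here.

```go
package main

import "fmt"

// condition is a hypothetical, pared-down stand-in for a
// ClusterOperator status condition.
type condition struct {
	Type, Status, Reason string
}

// Tracker referenced as [1] in this commit message.
const trackerURL = "https://issues.redhat.com/browse/OCPBUGS-23745"

// Reasons seen on monitoring's Available condition in this message.
var monitoringReasons = map[string]bool{
	"PlatformTasksFailed":                   true,
	"UpdatingAlertmanagerFailed":            true,
	"UpdatingConsolePluginComponentsFailed": true,
	"UpdatingPrometheusOperatorFailed":      true,
	"UpdatingPrometheusFailed":              true,
}

// exceptionFor returns a non-empty explanation when a transition is
// tolerated, and "" when it should fail the test-case.
func exceptionFor(operator string, c condition) string {
	if c.Type != "Available" {
		return ""
	}
	if c.Status == "True" {
		return "Available=True is the happy case"
	}
	// Assumption: False and Unknown are both covered by the reason list.
	if operator == "monitoring" && monitoringReasons[c.Reason] {
		return trackerURL
	}
	return ""
}

func main() {
	c := condition{Type: "Available", Status: "Unknown", Reason: "UpdatingPrometheusFailed"}
	fmt.Println(exceptionFor("monitoring", c))
}
```

Keeping the reasons in one set tied to a single tracker matches the current approach; forking the bug into multiple trackers would turn the set into a map from reason to tracker URL.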
[1]: https://issues.redhat.com/browse/OCPBUGS-23745
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/27231/pull-ci-openshift-origin-master-e2e-gcp-ovn-rt-upgrade/17295645937112227841
Parent: ca46d5d
1 file changed, with 1 addition and 1 deletion, in pkg/monitortests/clusterversionoperator/legacycvomonitortests