Commit 7d703e3

Merge pull request #371 from wking/no-desired-image
pkg/cvo: Set NoDesiredImage reason when desired.Image is empty
2 parents fc25a6f + 1249588 commit 7d703e3

4 files changed: +36 -11 lines changed

docs/user/reconciliation.md

Lines changed: 4 additions & 4 deletions
@@ -93,22 +93,22 @@ So the graph nodes are all parallelized with the by-number ordering flattened ou
 
 For the usual reconciliation loop (neither an upgrade between releases nor a fresh install), the flattened graph is also randomly permuted to avoid hanging on ordering bugs.
 
-## Synchronizing the graph
+## Reconciling the graph
 
 The cluster-version operator spawns worker goroutines that walk the graph, pushing manifests in their queue.
-For each manifest in the node, the worker synchronizes the cluster with the manifest using a resource builder.
+For each manifest in the node, the worker reconciles the cluster with the manifest using a resource builder.
 On error (or timeout), the worker abandons the manifest, graph node, and any dependencies of that graph node.
 On success, the worker proceeds to the next manifest in the graph node.
 
 ## Resource builders
 
-Resource builders synchronize the cluster with a manifest from the release image.
+Resource builders reconcile a cluster object with a manifest from the release image.
 The general approach is to generates a merged manifest combining critical spec properties from the release-image manifest with data from a preexisting in-cluster object, if any.
 If the merged manifest differs from the in-cluster object, the merged manifest is pushed back into the cluster.
 
 Some types have additional logic, as described in the following subsections.
 Note that this logic only applies to manifests included in the release image itself.
-For example, only [ClusterOperator](../dev/clusteroperator.md) from the release image will have the blocking logic described [below](#clusteroperator); if an admin or secondary operator pushed a ClusterOperator object, it would not impact the cluster-version operator's graph synchronization.
+For example, only [ClusterOperator](../dev/clusteroperator.md) from the release image will have the blocking logic described [below](#clusteroperator); if an admin or secondary operator pushed a ClusterOperator object, it would not impact the cluster-version operator's graph reconciliation.
 
 ### ClusterOperator
 
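
The merge-and-compare approach that the "Resource builders" text describes could be sketched roughly as follows. This is only an illustration under stated assumptions: `Object`, `merge`, `equalMaps`, and `needsPush` are hypothetical names, objects are reduced to string maps, and the real resource builders in the cluster-version operator are type-specific and considerably more involved.

```go
package main

import "fmt"

// Object is a hypothetical, simplified stand-in for a cluster object.
type Object struct {
	Spec   map[string]string // properties the release-image manifest may manage
	Status map[string]string // in-cluster data that the merge preserves
}

// merge copies the in-cluster object (if any) and lays the release-image
// manifest's spec properties over it. A nil existing means no object exists yet.
func merge(manifest Object, existing *Object) Object {
	merged := Object{Spec: map[string]string{}, Status: map[string]string{}}
	if existing != nil {
		for k, v := range existing.Spec {
			merged.Spec[k] = v
		}
		for k, v := range existing.Status {
			merged.Status[k] = v
		}
	}
	for k, v := range manifest.Spec {
		merged.Spec[k] = v
	}
	return merged
}

// equalMaps compares two maps, treating nil and empty maps as equal.
func equalMaps(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// needsPush reports whether the merged manifest differs from the in-cluster
// object and therefore has to be pushed back into the cluster.
func needsPush(merged Object, existing *Object) bool {
	if existing == nil {
		return true
	}
	return !equalMaps(merged.Spec, existing.Spec) || !equalMaps(merged.Status, existing.Status)
}

func main() {
	manifest := Object{Spec: map[string]string{"replicas": "1"}}
	existing := &Object{
		Spec:   map[string]string{"replicas": "3", "nodeSelector": "infra"},
		Status: map[string]string{"ready": "true"},
	}
	merged := merge(manifest, existing)
	fmt.Println(merged.Spec, needsPush(merged, existing))
}
```

In this sketch the manifest's `replicas` value overrides the in-cluster one while the unmanaged `nodeSelector` and the status survive, so a push is only needed when the managed properties have drifted.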

docs/user/status.md

Lines changed: 22 additions & 0 deletions
@@ -3,6 +3,27 @@
 [The ClusterVersion object](../dev/clusterversion.md) sets `conditions` describing the state of the cluster-version operator (CVO).
 This document describes those conditions and, where appropriate, suggests possible mitigations.
 
+## Failing
+
+When `Failing` is True, the CVO is failing to reconcile the cluster with the desired release image.
+In all cases, the impact on the cluster will be that dependent nodes in [the manifest graph](reconciliation.md#manifest-graph) may not be [reconciled](reconciliation.md#reconciling-the-graph).
+Note that the graph [may be flattened](reconciliation.md#manifest-graph), in which case there are no dependent nodes.
+
+Most reconciliation errors will result in `Failing=True`, although [`ClusterOperatorNotAvailable`](#clusteroperatornotavailable) has special handling.
+
+### NoDesiredImage
+
+The CVO has not been given a release image to reconcile.
+
+If this happens it is a CVO coding error, because clearing [`desiredUpdate`][api-desired-update] should return you to the current CVO's release image.
+
+### ClusterOperatorNotAvailable
+
+`ClusterOperatorNotAvailable` (or the consolidated `ClusterOperatorsNotAvailable`) is set when the CVO fails to retrieve the ClusterOperator from the cluster or when the retrieved ClusterOperator does not satisfy [the reconciliation conditions](reconciliation.md#clusteroperator).
+
+Unlike most manifest-reconciliation failures, this error does not immediately result in `Failing=True`.
+Under some conditions during installs and updates, the CVO will treat this condition as a `Progressing=True` condition and give the operator up to ten minutes to level before reporting `Failing=True`.
+
 ## RetrievedUpdates
 
 When `RetrievedUpdates` is `True`, the CVO is succesfully retrieving updates, which is good.
@@ -107,5 +128,6 @@ If this error occurs because you forced an update to a release that is not in an
 If this happens it is a CVO coding error.
 There is no mitigation short of updating to a new release image with a fixed CVO.
 
+[api-desired-update]: https://github.com/openshift/api/blob/34f54f12813aaed8822bb5bc56e97cbbfa92171d/config/v1/types_cluster_version.go#L40-L54
 [channels]: https://docs.openshift.com/container-platform/4.3/updating/updating-cluster-between-minor.html#understanding-upgrade-channels_updating-cluster-between-minor
 [Cincinnati]: https://github.com/openshift/cincinnati/blob/master/docs/design/openshift.md

pkg/cvo/cvo.go

Lines changed: 4 additions & 1 deletion
@@ -472,7 +472,10 @@ func (optr *Operator) sync(key string) error {
 	// handle the case of a misconfigured CVO by doing nothing
 	if len(desired.Image) == 0 {
 		return optr.syncStatus(original, config, &SyncWorkerStatus{
-			Failure: fmt.Errorf("No configured operator version, unable to update cluster"),
+			Failure: &payload.UpdateError{
+				Reason:  "NoDesiredImage",
+				Message: "No configured operator version, unable to update cluster",
+			},
 		}, errs)
 	}
 

pkg/cvo/cvo_scenarios_test.go

Lines changed: 6 additions & 6 deletions
@@ -165,8 +165,8 @@ func TestCVO_StartupAndSync(t *testing.T) {
 			Conditions: []configv1.ClusterOperatorStatusCondition{
 				{Type: configv1.OperatorAvailable, Status: configv1.ConditionFalse},
 				// report back to the user that we don't have enough info to proceed
-				{Type: ClusterStatusFailing, Status: configv1.ConditionTrue, Message: "No configured operator version, unable to update cluster"},
-				{Type: configv1.OperatorProgressing, Status: configv1.ConditionTrue, Message: "Unable to apply <unknown>: an error occurred"},
+				{Type: ClusterStatusFailing, Status: configv1.ConditionTrue, Reason: "NoDesiredImage", Message: "No configured operator version, unable to update cluster"},
+				{Type: configv1.OperatorProgressing, Status: configv1.ConditionTrue, Reason: "NoDesiredImage", Message: "Unable to apply <unknown>: an unknown error has occurred: NoDesiredImage"},
 				{Type: configv1.RetrievedUpdates, Status: configv1.ConditionFalse},
 			},
 		},
@@ -436,8 +436,8 @@ func TestCVO_StartupAndSyncUnverifiedPayload(t *testing.T) {
 			Conditions: []configv1.ClusterOperatorStatusCondition{
 				{Type: configv1.OperatorAvailable, Status: configv1.ConditionFalse},
 				// report back to the user that we don't have enough info to proceed
-				{Type: ClusterStatusFailing, Status: configv1.ConditionTrue, Message: "No configured operator version, unable to update cluster"},
-				{Type: configv1.OperatorProgressing, Status: configv1.ConditionTrue, Message: "Unable to apply <unknown>: an error occurred"},
+				{Type: ClusterStatusFailing, Status: configv1.ConditionTrue, Reason: "NoDesiredImage", Message: "No configured operator version, unable to update cluster"},
+				{Type: configv1.OperatorProgressing, Status: configv1.ConditionTrue, Reason: "NoDesiredImage", Message: "Unable to apply <unknown>: an unknown error has occurred: NoDesiredImage"},
 				{Type: configv1.RetrievedUpdates, Status: configv1.ConditionFalse},
 			},
 		},
@@ -697,8 +697,8 @@ func TestCVO_StartupAndSyncPreconditionFailing(t *testing.T) {
 			Conditions: []configv1.ClusterOperatorStatusCondition{
 				{Type: configv1.OperatorAvailable, Status: configv1.ConditionFalse},
 				// report back to the user that we don't have enough info to proceed
-				{Type: ClusterStatusFailing, Status: configv1.ConditionTrue, Message: "No configured operator version, unable to update cluster"},
-				{Type: configv1.OperatorProgressing, Status: configv1.ConditionTrue, Message: "Unable to apply <unknown>: an error occurred"},
+				{Type: ClusterStatusFailing, Status: configv1.ConditionTrue, Reason: "NoDesiredImage", Message: "No configured operator version, unable to update cluster"},
+				{Type: configv1.OperatorProgressing, Status: configv1.ConditionTrue, Reason: "NoDesiredImage", Message: "Unable to apply <unknown>: an unknown error has occurred: NoDesiredImage"},
 				{Type: configv1.RetrievedUpdates, Status: configv1.ConditionFalse},
 			},
 		},
