Commit ed67bf4

Merge pull request #350 from lilic/failing-degraded
docs/dev/clusteroperator.md: Change Failing to Degraded
2 parents 07e65a3 + bf4fbd0 commit ed67bf4

1 file changed: +8 -8 lines changed

docs/dev/clusteroperator.md

Lines changed: 8 additions & 8 deletions
@@ -148,43 +148,43 @@ If this is false, it means the operator is not trying to apply any new state.
148148
If it remains true for an extended period of time, it suggests something is wrong in the cluster. It can probably wait until Monday.
149149
* `Available` must be true if the operand is functional and available in the cluster at the level in status.
150150
If this is false, it means there is an outage. Someone is probably getting paged.
151-
* `Failing` should be true if the operator has encountered an error that is preventing it or its operand from working properly.
151+
* `Degraded` should be true if the operator has encountered an error that is preventing it or its operand from working properly.
152152
The operand may still be available, but intent may not have been fulfilled.
153153
If this is true, it means that the operand is at risk of an outage or improper configuration. It can probably wait until the morning, but someone needs to look at it.
154154

155-
The message reported for each of these conditions is important. All messages should start with a capital letter (like a sentence) and be written for an end user / admin to debug the problem. `Failing` should describe in detail (a few sentences at most) why the current controller is blocked. The detail should be sufficient for an engineer or support person to triage the problem. `Available` should convey useful information about what is available, and be a single sentence without punctuation. `Progressing` is the most important message because it is shown by default in the CLI as a column and should be a terse, human-readable message describing the current state of the object in 5-10 words (the more succinct the better).
155+
The message reported for each of these conditions is important. All messages should start with a capital letter (like a sentence) and be written for an end user / admin to debug the problem. `Degraded` should describe in detail (a few sentences at most) why the current controller is blocked. The detail should be sufficient for an engineer or support person to triage the problem. `Available` should convey useful information about what is available, and be a single sentence without punctuation. `Progressing` is the most important message because it is shown by default in the CLI as a column and should be a terse, human-readable message describing the current state of the object in 5-10 words (the more succinct the better).
156156

157157
For instance, if the CVO is working towards 4.0.1 and has already successfully deployed 4.0.0, the conditions might be reporting:
158158

159-
* `Failing` is false with no message
159+
* `Degraded` is false with no message
160160
* `Available` is true with message `Cluster has deployed 4.0.0`
161161
* `Progressing` is true with message `Working towards 4.0.1`
162162

163163
If the controller reaches 4.0.1, the conditions might be:
164164

165-
* `Failing` is false with no message
165+
* `Degraded` is false with no message
166166
* `Available` is true with message `Cluster has deployed 4.0.1`
167167
* `Progressing` is false with message `Cluster version is 4.0.1`
168168

169169
If an error blocks reaching 4.0.1, the conditions might be:
170170

171-
* `Failing` is true with a detailed message `Unable to apply 4.0.1: could not update 0000_70_network_deployment.yaml because the resource type NetworkConfig has not been installed on the server.`
171+
* `Degraded` is true with a detailed message `Unable to apply 4.0.1: could not update 0000_70_network_deployment.yaml because the resource type NetworkConfig has not been installed on the server.`
172172
* `Available` is true with message `Cluster has deployed 4.0.0`
173173
* `Progressing` is true with message `Unable to apply 4.0.1: a required object is missing`
174174

175-
The progressing message is the first message a human will see when debugging an issue, so it should be terse, succinct, and summarize the problem well. The failing message can be more verbose. Start with simple, easy to understand messages and grow them over time to capture more detail.
175+
The progressing message is the first message a human will see when debugging an issue, so it should be terse, succinct, and summarize the problem well. The degraded message can be more verbose. Start with simple, easy to understand messages and grow them over time to capture more detail.
176176

177177

178178
#### Conditions and Install/Upgrade
179179

180180
Conditions determine when the CVO considers certain actions complete, the following table summarizes what it looks at and when.
181181

182182

183-
| operation | version | available | degraded | progressing |
183+
| operation | version | available | degraded | progressing |
184184
|-----------|---------|-----------|----------|-------------|
185185
| Install completion[1] | current(whatever was being installed) | true | any | any
186186
| Begin upgrade | any | any | any | any
187-
| Begin upgrade (w/ force) | any | any | any | any
187+
| Begin upgrade (w/ force) | any | any | any | any
188188
| Upgrade completion[2]| newVersion(target version for the upgrade) | true | false | false
189189

190190
[1] Install works on all components in parallel, it does not wait for any component to complete before starting another one.
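
As a rough illustration of the table above, the install and upgrade completion checks could be sketched in Go as below. This is a minimal sketch, not the CVO's actual code: the `Conditions` struct and the `InstallComplete` and `UpgradeComplete` functions are hypothetical names introduced here, and the real ClusterOperator status carries a list of typed conditions rather than plain booleans. Columns marked `any` in the table are simply not checked.

```go
package main

import "fmt"

// Conditions is a hypothetical, flattened view of one operator's reported
// status. It is an assumption made for illustration only.
type Conditions struct {
	Version     string
	Available   bool
	Degraded    bool
	Progressing bool
}

// InstallComplete mirrors the "Install completion" row: the reported version
// must be the one being installed and Available must be true. Degraded and
// Progressing are "any", so they are ignored.
func InstallComplete(c Conditions, installing string) bool {
	return c.Version == installing && c.Available
}

// UpgradeComplete mirrors the "Upgrade completion" row: the reported version
// must be the upgrade target, Available must be true, and both Degraded and
// Progressing must be false.
func UpgradeComplete(c Conditions, target string) bool {
	return c.Version == target && c.Available && !c.Degraded && !c.Progressing
}

func main() {
	// An operator that reports the target version but is still degraded.
	c := Conditions{Version: "4.0.1", Available: true, Degraded: true}

	fmt.Println(InstallComplete(c, "4.0.1")) // prints true
	fmt.Println(UpgradeComplete(c, "4.0.1")) // prints false
}
```

The two "Begin upgrade" rows need no check in this sketch, since every column in them is `any`.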
