Commit 904d9ad

OSDOCS-6630: second iteration of how updates work doc
1 parent 41b6fa7 commit 904d9ad

6 files changed

+160
-101
lines changed

modules/update-cluster-version-object.adoc

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * updating/understanding_updates/how-updates-work.adoc
4+
5+
:_content-type: CONCEPT
6+
[id="update-cluster-version-object_{context}"]
7+
= The ClusterVersion object
8+
9+
One of the resources that the Cluster Version Operator (CVO) monitors is the `ClusterVersion` resource.
10+
11+
Administrators and OpenShift components can communicate or interact with the CVO through the `ClusterVersion` object.
12+
The desired CVO state is declared through the `ClusterVersion` object, and the current CVO state is reflected in the object's status.
13+
14+
[NOTE]
15+
====
16+
Do not directly modify the `ClusterVersion` object. Instead, use interfaces such as the `oc` CLI or the web console to declare your update target.
17+
====
18+
19+
The CVO continually reconciles the cluster with the target state declared in the `spec` property of the `ClusterVersion` resource.
20+
When the desired release differs from the actual release, that reconciliation updates the cluster.
21+
22+
//to-do: this might be heading overload, consider deleting this heading if the context switch from the previous paragraph to this content is smooth enough to not require one.
23+
[discrete]
24+
== Update availability data
25+
26+
The `ClusterVersion` resource also contains information about updates that are available to the cluster.
27+
This includes updates that are available, but not recommended due to a known risk that applies to the cluster.
28+
These updates are known as conditional updates.
29+
To learn how the CVO maintains this information about available updates in the `ClusterVersion` resource, see the "Evaluation of update availability" section.
30+
31+
* You can inspect all available updates with the following command:
32+
+
33+
[source,terminal]
34+
----
35+
$ oc adm upgrade --include-not-recommended
36+
----
37+
+
38+
[NOTE]
39+
====
40+
The additional `--include-not-recommended` parameter includes updates that are available but not recommended due to a known risk that applies to the cluster.
41+
====
42+
+
43+
.Example output
44+
[source,terminal]
45+
----
46+
Cluster version is 4.10.22
47+
48+
Upstream is unset, so the cluster will use an appropriate default.
49+
Channel: fast-4.11 (available channels: candidate-4.10, candidate-4.11, eus-4.10, fast-4.10, fast-4.11, stable-4.10)
50+
51+
Recommended updates:
52+
53+
VERSION IMAGE
54+
4.10.26 quay.io/openshift-release-dev/ocp-release@sha256:e1fa1f513068082d97d78be643c369398b0e6820afab708d26acda2262940954
55+
4.10.25 quay.io/openshift-release-dev/ocp-release@sha256:ed84fb3fbe026b3bbb4a2637ddd874452ac49c6ead1e15675f257e28664879cc
56+
4.10.24 quay.io/openshift-release-dev/ocp-release@sha256:aab51636460b5a9757b736a29bc92ada6e6e6282e46b06e6fd483063d590d62a
57+
4.10.23 quay.io/openshift-release-dev/ocp-release@sha256:e40e49d722cb36a95fa1c03002942b967ccbd7d68de10e003f0baa69abad457b
58+
59+
Supported but not recommended updates:
60+
61+
Version: 4.11.0
62+
Image: quay.io/openshift-release-dev/ocp-release@sha256:300bce8246cf880e792e106607925de0a404484637627edf5f517375517d54a4
63+
Recommended: False
64+
Reason: RPMOSTreeTimeout
65+
Message: Nodes with substantial numbers of containers and CPU contention may not reconcile machine configuration https://bugzilla.redhat.com/show_bug.cgi?id=2111817#c22
66+
----
67+
+
68+
The `oc adm upgrade` command queries the `ClusterVersion` resource for information about available updates and presents it in a human-readable format.
69+
70+
* One way to directly inspect the underlying availability data created by the CVO is by querying the `ClusterVersion` resource with the following command:
71+
+
72+
[source,terminal]
73+
----
74+
$ oc get clusterversion version -o json | jq '.status.availableUpdates'
75+
----
76+
+
77+
.Example output
78+
[source,terminal]
79+
----
80+
[
81+
{
82+
"channels": [
83+
"candidate-4.11",
84+
"candidate-4.12",
85+
"fast-4.11",
86+
"fast-4.12"
87+
],
88+
"image": "quay.io/openshift-release-dev/ocp-release@sha256:400267c7f4e61c6bfa0a59571467e8bd85c9188e442cbd820cc8263809be3775",
89+
"url": "https://access.redhat.com/errata/RHBA-2023:3213",
90+
"version": "4.11.41"
91+
},
92+
...
93+
]
94+
----
95+
96+
* A similar command can be used to check conditional updates:
97+
+
98+
[source,terminal]
99+
----
100+
$ oc get clusterversion version -o json | jq '.status.conditionalUpdates'
101+
----
102+
+
103+
.Example output
104+
[source,terminal]
105+
----
106+
[
107+
{
108+
"conditions": [
109+
{
110+
"lastTransitionTime": "2023-05-30T16:28:59Z",
111+
"message": "The 4.11.36 release only resolves an installation issue https://issues.redhat.com//browse/OCPBUGS-11663 , which does not affect already running clusters. 4.11.36 does not include fixes delivered in recent 4.11.z releases and therefore upgrading from these versions would cause fixed bugs to reappear. Red Hat does not recommend upgrading clusters to 4.11.36 version for this reason. https://access.redhat.com/solutions/7007136",
112+
"reason": "PatchesOlderRelease",
113+
"status": "False",
114+
"type": "Recommended"
115+
}
116+
],
117+
"release": {
118+
"channels": [...],
119+
"image": "quay.io/openshift-release-dev/ocp-release@sha256:8c04176b771a62abd801fcda3e952633566c8b5ff177b93592e8e8d2d1f8471d",
120+
"url": "https://access.redhat.com/errata/RHBA-2023:1733",
121+
"version": "4.11.36"
122+
},
123+
"risks": [...]
124+
},
125+
...
126+
]
127+
----
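
The conditional update data shown above can also be processed programmatically. The following is a minimal sketch, not part of the commit, that lists each conditional update whose `Recommended` condition is `False` together with its reason. The `SAMPLE` data and the `not_recommended` helper are hypothetical; the sample only mirrors the field names from the example output, with the message shortened and the elided `channels` and `risks` fields left empty.

```python
import json

# Hypothetical sample shaped like the `.status.conditionalUpdates` example output.
# The message is shortened and the elided "channels" and "risks" fields are empty.
SAMPLE = """
[
  {
    "conditions": [
      {
        "lastTransitionTime": "2023-05-30T16:28:59Z",
        "message": "Shortened for this example.",
        "reason": "PatchesOlderRelease",
        "status": "False",
        "type": "Recommended"
      }
    ],
    "release": {
      "channels": [],
      "url": "https://access.redhat.com/errata/RHBA-2023:1733",
      "version": "4.11.36"
    },
    "risks": []
  }
]
"""


def not_recommended(conditional_updates):
    """Return (version, reason) pairs for updates whose Recommended condition is False."""
    results = []
    for update in conditional_updates:
        for condition in update.get("conditions", []):
            if condition.get("type") == "Recommended" and condition.get("status") == "False":
                results.append((update["release"]["version"], condition.get("reason")))
    return results


if __name__ == "__main__":
    for version, reason in not_recommended(json.loads(SAMPLE)):
        print(f"{version}: {reason}")
```

To run the same filter against a live cluster, one option is to save the output of `oc get clusterversion version -o json` and pass its `status.conditionalUpdates` list to the helper.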

modules/update-cvo.adoc

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * updating/understanding_updates/how-updates-work.adoc
4+
5+
:_content-type: CONCEPT
6+
[id="update-cvo_{context}"]
7+
= The Cluster Version Operator
8+
9+
// adding a poorly written, technically inaccurate skeleton of a module for now, which can be replaced/refined by SMEs as they see fit
10+
11+
The Cluster Version Operator (CVO) is the primary component that orchestrates and facilitates the {product-title} update process.
12+
During installation and standard cluster operation, the CVO continually compares the manifests of managed cluster Operators to in-cluster resources and reconciles discrepancies to ensure that the actual state of these resources matches their desired state.
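
The compare-and-reconcile idea described above can be illustrated with a deliberately simplified sketch. This is not the CVO's implementation: the `find_drift` and `reconcile` helpers are hypothetical, resources are modeled as plain dictionaries keyed by name, and strict equality stands in for whatever comparison the CVO actually performs.

```python
def find_drift(desired, actual):
    """Return the names of resources whose in-cluster state differs from the manifest."""
    return sorted(name for name, manifest in desired.items() if actual.get(name) != manifest)


def reconcile(desired, actual):
    """Overwrite each drifted resource with its manifest (assumption: simple replacement)."""
    for name in find_drift(desired, actual):
        actual[name] = desired[name]
    return actual


if __name__ == "__main__":
    # Hypothetical manifests and in-cluster state, for illustration only.
    desired = {"deployment/a": {"replicas": 2}, "configmap/b": {"key": "v2"}}
    actual = {"deployment/a": {"replicas": 2}, "configmap/b": {"key": "v1"}}
    print(find_drift(desired, actual))
    reconcile(desired, actual)
    print(find_drift(desired, actual))
```

Running the comparison repeatedly, rather than once, is what makes the process a reconciliation loop: any later drift is detected and corrected on the next pass.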

modules/update-evaluate-availability.adoc

Lines changed: 0 additions & 93 deletions
@@ -19,96 +19,3 @@ If the CVO finds that the cluster does not match the risks of an update, or that
1919

2020
The user interface, either the web console or the OpenShift CLI (`oc`), presents this information in sectioned headings to the administrator.
2121
Each *supported but not recommended* update recommendation contains a link to further resources about the risk so that the administrator can make an informed decision about the update.
22-
23-
You can inspect all available updates with the following command:
24-
25-
[source,terminal]
26-
----
27-
$ oc adm upgrade --include-not-recommended
28-
----
29-
30-
The additional `--include-not-recommended` parameter includes updates that are available but not recommended due to a known risk that applies to the cluster.
31-
32-
.Example output
33-
[source,terminal]
34-
----
35-
Cluster version is 4.10.22
36-
37-
Upstream is unset, so the cluster will use an appropriate default.
38-
Channel: fast-4.11 (available channels: candidate-4.10, candidate-4.11, eus-4.10, fast-4.10, fast-4.11, stable-4.10)
39-
40-
Recommended updates:
41-
42-
VERSION IMAGE
43-
4.10.26 quay.io/openshift-release-dev/ocp-release@sha256:e1fa1f513068082d97d78be643c369398b0e6820afab708d26acda2262940954
44-
4.10.25 quay.io/openshift-release-dev/ocp-release@sha256:ed84fb3fbe026b3bbb4a2637ddd874452ac49c6ead1e15675f257e28664879cc
45-
4.10.24 quay.io/openshift-release-dev/ocp-release@sha256:aab51636460b5a9757b736a29bc92ada6e6e6282e46b06e6fd483063d590d62a
46-
4.10.23 quay.io/openshift-release-dev/ocp-release@sha256:e40e49d722cb36a95fa1c03002942b967ccbd7d68de10e003f0baa69abad457b
47-
48-
Supported but not recommended updates:
49-
50-
Version: 4.11.0
51-
Image: quay.io/openshift-release-dev/ocp-release@sha256:300bce8246cf880e792e106607925de0a404484637627edf5f517375517d54a4
52-
Recommended: False
53-
Reason: RPMOSTreeTimeout
54-
Message: Nodes with substantial numbers of containers and CPU contention may not reconcile machine configuration https://bugzilla.redhat.com/show_bug.cgi?id=2111817#c22
55-
----
56-
57-
One way to inspect the underlying availability data created by the CVO is by querying the `ClusterVersion` resource with the following command:
58-
59-
[source,terminal]
60-
----
61-
$ oc get clusterversion version -o json | jq '.status.availableUpdates'
62-
----
63-
64-
.Example output
65-
[source,terminal]
66-
----
67-
[
68-
{
69-
"channels": [
70-
"candidate-4.11",
71-
"candidate-4.12",
72-
"fast-4.11",
73-
"fast-4.12"
74-
],
75-
"image": "quay.io/openshift-release-dev/ocp-release@sha256:400267c7f4e61c6bfa0a59571467e8bd85c9188e442cbd820cc8263809be3775",
76-
"url": "https://access.redhat.com/errata/RHBA-2023:3213",
77-
"version": "4.11.41"
78-
},
79-
...
80-
]
81-
----
82-
83-
A similar command can be used to check conditional updates:
84-
85-
[source,terminal]
86-
----
87-
$ oc get clusterversion version -o json | jq '.status.conditionalUpdates'
88-
----
89-
90-
.Example output
91-
[source,terminal]
92-
----
93-
[
94-
{
95-
"conditions": [
96-
{
97-
"lastTransitionTime": "2023-05-30T16:28:59Z",
98-
"message": "The 4.11.36 release only resolves an installation issue https://issues.redhat.com//browse/OCPBUGS-11663 , which does not affect already running clusters. 4.11.36 does not include fixes delivered in recent 4.11.z releases and therefore upgrading from these versions would cause fixed bugs to reappear. Red Hat does not recommend upgrading clusters to 4.11.36 version for this reason. https://access.redhat.com/solutions/7007136",
99-
"reason": "PatchesOlderRelease",
100-
"status": "False",
101-
"type": "Recommended"
102-
}
103-
],
104-
"release": {
105-
"channels": [...],
106-
"image": "quay.io/openshift-release-dev/ocp-release@sha256:8c04176b771a62abd801fcda3e952633566c8b5ff177b93592e8e8d2d1f8471d",
107-
"url": "https://access.redhat.com/errata/RHBA-2023:1733",
108-
"version": "4.11.36"
109-
},
110-
"risks": [...]
111-
},
112-
...
113-
]
114-
----

modules/update-manifest-application.adoc

Lines changed: 8 additions & 5 deletions
@@ -38,24 +38,27 @@ The CVO then applies manifests following the generated dependency graph.
3838
[NOTE]
3939
====
4040
For some resource types, the CVO monitors the resource after its manifest is applied, and considers it to be successfully updated only after the resource reaches a stable state.
41-
Achieving this stable state can take some time.
42-
This is especially true for cluster Operators, which might perform their own update actions in the cluster after the CVO deploys their new versions.
43-
While the additional update actions take place, these cluster Operators temporarily set their `Progressing` condition to `True`.
41+
Achieving this state can take some time.
42+
This is especially true for `ClusterOperator` resources, where the CVO waits for a cluster Operator to update itself and then update its `ClusterOperator` status.
4443
====
4544

45+
// to do: potentially reword the note above to clarify that specific resources are being applied at one time, and not necessarily all the resources for that component.
46+
4647
The CVO waits until all cluster Operators in the Runlevel meet the following conditions before it proceeds to the next Runlevel:
4748

4849
* The cluster Operators have an `Available=True` condition.
4950
5051
* The cluster Operators have a `Degraded=False` condition.
5152
53+
// to do: potentially clarify that this condition is not applicable during installations, and also potentially add documentation (here or elsewhere) that explains how the CVO is constantly reconciling states whether or not an update is happening.
54+
5255
* The cluster Operators declare they have achieved the desired version in their ClusterOperator resource.
5356
5457
Some actions can take significant time to finish. The CVO waits for the actions to complete in order to ensure the subsequent Runlevels can proceed safely.
55-
The process of applying all manifests is expected to take 60 to 120 minutes in total; see *Understanding {product-title} update duration* for more information about factors that influence update duration.
58+
Initially reconciling the new release's manifests is expected to take 60 to 120 minutes in total; see *Understanding {product-title} update duration* for more information about factors that influence update duration.
5659

5760
image::update-runlevels.png[A diagram displaying the sequence of Runlevels and the manifests of components within each level]
5861

5962
In the previous example diagram, the CVO is waiting until all work is completed at Runlevel 20.
6063
The CVO has applied all manifests to the Operators in the Runlevel, but the `kube-apiserver-operator ClusterOperator` is still performing some actions after its new version was deployed. The `kube-apiserver-operator ClusterOperator` declares this progress through the `Progressing=True` condition and by not declaring the new version as reconciled in its `status.versions`.
61-
The CVO waits until the ClusterOperator reports an acceptable status, and then it will start applying manifests at Runlevel 25.
64+
The CVO waits until the ClusterOperator reports an acceptable status, and then it will start reconciling manifests at Runlevel 25.
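
The three Runlevel conditions listed in this module can be expressed as a small check. The following sketch is illustrative only and assumes a simplified data model: the `ClusterOperatorStatus` class and both helper functions are hypothetical, conditions are stored as a mapping from condition type to status string, and `versions` is flattened to a mapping with a single hypothetical `operator` key.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterOperatorStatus:
    """Simplified, hypothetical view of a ClusterOperator's status."""
    name: str
    conditions: dict = field(default_factory=dict)  # condition type -> "True" or "False"
    versions: dict = field(default_factory=dict)    # assumed shape: {"operator": "<version>"}


def operator_ready(co, desired_version):
    """Check Available=True, Degraded=False, and that the desired version is declared."""
    return (
        co.conditions.get("Available") == "True"
        and co.conditions.get("Degraded") == "False"
        and co.versions.get("operator") == desired_version
    )


def runlevel_complete(operators, desired_version):
    """A Runlevel is complete only when every cluster Operator in it is ready."""
    return all(operator_ready(co, desired_version) for co in operators)


if __name__ == "__main__":
    # Hypothetical Runlevel where one Operator has not yet declared the new version.
    operators = [
        ClusterOperatorStatus("etcd", {"Available": "True", "Degraded": "False"}, {"operator": "4.11.41"}),
        ClusterOperatorStatus(
            "kube-apiserver",
            {"Available": "True", "Degraded": "False", "Progressing": "True"},
            {"operator": "4.10.22"},
        ),
    ]
    print(runlevel_complete(operators, "4.11.41"))
```

In this sketch, `Progressing=True` does not block the Runlevel by itself; the second Operator is not ready because it has not yet declared the desired version, which matches the waiting behavior described for the diagram.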

modules/update-process-workflow.adoc

Lines changed: 6 additions & 2 deletions
@@ -27,14 +27,18 @@ The job then extracts the manifests and metadata from the release image to a sha
2727
Certain conditions can prevent updates from proceeding.
2828
These conditions are either determined by the CVO itself, or reported by individual cluster Operators that detect some details about the cluster that the Operator considers problematic for the update.
2929

30+
// to do: potentially add an example of a precondition to the bullet above.
31+
3032
. The CVO records the accepted release in `status.desired` and creates a `status.history` entry about the new update.
3133

32-
. The CVO begins applying the manifests from the release image.
34+
. The CVO begins reconciling the manifests from the release image.
3335
Cluster Operators are updated in separate stages called Runlevels, and the CVO ensures that all Operators in a Runlevel finish updating before it proceeds to the next level.
3436

3537
. Manifests for the CVO itself are applied early in the process.
3638
When the CVO deployment is applied, the current CVO pod terminates, and a CVO pod using the new version starts.
37-
The new CVO proceeds to apply the remaining manifests.
39+
The new CVO proceeds to reconcile the remaining manifests.
40+
41+
// to do: potentially replace some instances of "apply" in this doc with something like "reconcile" to imply that a lot of these processes are constantly repeating, rather than happening only once.
3842

3943
. The update proceeds until the entire control plane is updated to the new version.
4044
Individual cluster Operators might perform update tasks on their domain of the cluster, and while they do so, they report their state through the `Progressing=True` condition.

updating/understanding_updates/how-updates-work.adoc

Lines changed: 7 additions & 1 deletion
@@ -8,8 +8,14 @@ toc::[]
88

99
The following sections describe each major aspect of the {product-title} (OCP) update process in detail. For a general overview of how updates work, see the xref:../../updating/understanding_updates/intro-to-updates.adoc#understanding-openshift-updates[Introduction to OpenShift updates].
1010

11+
// The Cluster Version Operator
12+
include::modules/update-cvo.adoc[leveloffset=+1]
13+
14+
// The ClusterVersion object
15+
include::modules/update-cluster-version-object.adoc[leveloffset=+2]
16+
1117
// Evaluation of update availability
12-
include::modules/update-evaluate-availability.adoc[leveloffset=+1]
18+
include::modules/update-evaluate-availability.adoc[leveloffset=+2]
1319

1420
[role="_additional-resources"]
1521
.Additional resources
