| 1 | +# Reconciliation |
| 2 | + |
| 3 | +This document describes the cluster-version operator's reconciliation logic and explains how the operator applies a release image to the cluster. |
| 4 | + |
| 5 | +## Release image content |
| 6 | + |
| 7 | +```console |
| 8 | +$ mkdir /tmp/release |
| 9 | +$ oc image extract quay.io/openshift-release-dev/ocp-release:4.1.0[-1] --path /:/tmp/release |
| 10 | +$ ls /tmp/release/release-manifests |
| 11 | +0000_03_authorization-openshift_01_rolebindingrestriction.crd.yaml |
| 12 | +0000_03_quota-openshift_01_clusterresourcequota.crd.yaml |
| 13 | +0000_03_security-openshift_01_scc.crd.yaml |
| 14 | +0000_05_config-operator_02_apiserver.cr.yaml |
| 15 | +0000_05_config-operator_02_authentication.cr.yaml |
| 16 | +... |
| 17 | +0000_90_openshift-controller-manager-operator_02_servicemonitor.yaml |
| 18 | +0000_90_openshift-controller-manager-operator_03_operand-servicemonitor.yaml |
| 19 | +image-references |
| 20 | +release-metadata |
| 21 | +$ cat /tmp/release/release-manifests/release-metadata |
| 22 | +{ |
| 23 | + "kind": "cincinnati-metadata-v0", |
| 24 | + "version": "4.1.0", |
| 25 | + "previous": [], |
| 26 | + "metadata": { |
| 27 | + "description": "", |
| 28 | + "url": "https://access.redhat.com/errata/RHBA-2019:0758" |
| 29 | + } |
| 30 | +} |
| 31 | +$ cat /tmp/release/release-manifests/image-references |
| 32 | +{ |
| 33 | + "kind": "ImageStream", |
| 34 | + "apiVersion": "image.openshift.io/v1", |
| 35 | + "metadata": { |
| 36 | + "name": "4.1.0", |
| 37 | + "creationTimestamp": "2019-06-03T14:49:14Z", |
| 38 | + "annotations": { |
| 39 | + "release.openshift.io/from-image-stream": "ocp/4.1-art-latest-2019-05-31-174150", |
| 40 | + "release.openshift.io/from-release": "registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-31-174150" |
| 41 | + } |
| 42 | + }, |
| 43 | + "spec": { |
| 44 | + "lookupPolicy": { |
| 45 | + "local": false |
| 46 | + }, |
| 47 | + "tags": [ |
| 48 | + { |
| 49 | + "name": "aws-machine-controllers", |
| 50 | + "annotations": { |
| 51 | + "io.openshift.build.commit.id": "d8d8e285fc19920c3311e791f4fe22db7003588f", |
| 52 | + "io.openshift.build.commit.ref": "", |
| 53 | + "io.openshift.build.source-location": "https://github.com/openshift/cluster-api-provider-aws" |
| 54 | + }, |
| 55 | + "from": { |
| 56 | + "kind": "DockerImage", |
| 57 | + "name": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7483248489c918e0c65a6b391bd171da0565cb9995b2acc61a1e517b6551e037" |
| 58 | + }, |
| 59 | + "generation": 2, |
| 60 | + "importPolicy": {}, |
| 61 | + "referencePolicy": { |
| 62 | + "type": "Source" |
| 63 | + } |
| 64 | + }, |
| 65 | + ... |
| 66 | + ] |
| 67 | + }, |
| 68 | + "status": { |
| 69 | + "dockerImageRepository": "" |
| 70 | + } |
| 71 | +} |
| 72 | +``` |
| 73 | + |
| 74 | +## Manifest graph |
| 75 | + |
| 76 | +The cluster-version operator unpacks the release image, ingests manifests, and loads them into a graph. |
| 77 | +For upgrades, the graph is ordered by the number and component of the manifest file: |
| 78 | + |
| 79 | +<div style="text-align:center"> |
| 80 | + <img src="tasks-by-number-and-component.svg" width="100%" /> |
| 81 | +</div> |
| 82 | + |
| 83 | +The `0000_03_authorization-openshift_*` manifest gets its own node, the `0000_03_quota-openshift_01_*` manifest gets its own node, and the `0000_03_security-openshift_*` manifest gets its own node. |
| 84 | +The next group of manifests is under `0000_05_config-operator_*`. |
| 85 | +Because the number is bumped, the graph blocks until all of the previous `0000_03_*` manifests are complete before beginning the `0000_05_*` block. |
| 86 | + |
| 87 | +The initial install is more relaxed, because the cluster does not yet hold any user data that needs protecting. |
| 88 | +The graph nodes are therefore all parallelized, with the by-number ordering flattened out: |
| 89 | + |
| 90 | +<div style="text-align:center"> |
| 91 | + <img src="tasks-flatten-by-number-and-component.svg" width="100%" /> |
| 92 | +</div> |
| 93 | + |
| 94 | +For the usual reconciliation loop (neither an upgrade between releases nor a fresh install), the flattened graph is also randomly permuted to avoid hanging on ordering bugs. |
| 95 | + |
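The ordering rules above can be sketched in Python. This is a minimal illustration, not the operator's actual implementation: the names `parse_manifest_name`, `build_upgrade_graph`, and `flatten`, and the list-of-stages representation, are assumptions made for this example. It also assumes file names of the form `0000_<number>_<component>_<rest>`, as in the listing earlier in this document, and that the random permutation applies to whole graph nodes.

```python
import random
import re
from itertools import groupby

# Assumed file-name shape: 0000_<number>_<component>_<rest>
MANIFEST_NAME = re.compile(r"^0000_(\d+)_([^_]+)_(.+)$")


def parse_manifest_name(filename):
    """Split a manifest file name into (number, component)."""
    match = MANIFEST_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected manifest name: {filename}")
    return int(match.group(1)), match.group(2)


def build_upgrade_graph(filenames):
    """Return a list of stages; a stage is a list of nodes; a node is a list of manifests.

    Stages run one after another (ordered by number).
    Nodes within one stage (one per component) may run in parallel.
    """
    ordered = sorted(filenames, key=lambda name: (parse_manifest_name(name), name))
    stages = []
    for _, in_stage in groupby(ordered, key=lambda name: parse_manifest_name(name)[0]):
        nodes = [
            list(group)
            for _, group in groupby(list(in_stage), key=lambda name: parse_manifest_name(name)[1])
        ]
        stages.append(nodes)
    return stages


def flatten(stages, shuffle=False, rng=None):
    """Collapse all stages into a single stage so every node can run in parallel.

    With shuffle=True the nodes are also randomly permuted, as in the
    usual reconciliation loop.
    """
    nodes = [node for stage in stages for node in stage]
    if shuffle:
        (rng or random).shuffle(nodes)
    return [nodes]
```

In this sketch an upgrade would use `build_upgrade_graph(names)` directly, an install would use `flatten(...)`, and the usual reconciliation loop would use `flatten(..., shuffle=True)`.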
| 96 | +## Synchronizing the graph |
| 97 | + |
| 98 | +The cluster-version operator spawns worker goroutines that walk the graph, pushing manifests in their queue. |
| 99 | +For each manifest in the node, the worker synchronizes the cluster with the manifest using a resource builder. |
| 100 | +On error (or timeout), the worker abandons the manifest, the rest of its graph node, and any graph nodes that depend on that node. |
| 101 | +On success, the worker proceeds to the next manifest in the graph node. |
| 102 | + |
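The abandon-on-error behavior can be illustrated with a small sequential sketch. The real operator walks the graph with parallel worker goroutines; this example processes nodes one at a time for clarity. The function name `sync_graph`, the list-of-stages graph shape (later stages depend on all earlier ones), and the `apply_manifest` callable are hypothetical.

```python
def sync_graph(stages, apply_manifest):
    """Walk the stages in order and return (done, abandoned) manifest lists.

    `apply_manifest` is a hypothetical callable that synchronizes one
    manifest and raises an exception on error or timeout.
    """
    done, abandoned = [], []
    blocked = False  # becomes True once any earlier stage has failed
    for stage in stages:
        stage_failed = False
        for node in stage:
            # Nodes that depend on a failed stage are abandoned outright.
            node_failed = blocked
            for manifest in node:
                if node_failed:
                    abandoned.append(manifest)
                    continue
                try:
                    apply_manifest(manifest)
                except Exception:
                    # Abandon this manifest and the rest of its node.
                    node_failed = True
                    stage_failed = True
                    abandoned.append(manifest)
                else:
                    done.append(manifest)
        blocked = blocked or stage_failed
    return done, abandoned
```

Sibling nodes in the same stage do not depend on each other, so a failure in one node does not stop the others in this sketch.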
| 103 | +## Resource builders |
| 104 | + |
| 105 | +Resource builders synchronize the cluster with a manifest from the release image. |
| 106 | +The general approach is to generate a merged manifest combining critical spec properties from the release-image manifest with data from a preexisting in-cluster object, if any. |
| 107 | +If the merged manifest differs from the in-cluster object, the merged manifest is pushed back into the cluster. |
| 108 | + |
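As a rough illustration of the merge-then-compare approach, the sketch below overlays the release-image manifest onto the in-cluster object and pushes only when something changed. The real builders merge type-specific properties; the generic recursive overlay here is a simplified stand-in, and `merge_required`, `sync_manifest`, and the `push` callable are hypothetical names.

```python
import copy


def merge_required(existing, required):
    """Return a copy of `existing` with every value from `required` overlaid."""
    merged = copy.deepcopy(existing)
    for key, value in required.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so in-cluster keys absent from `required` survive.
            merged[key] = merge_required(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def sync_manifest(existing, required, push):
    """Push a merged manifest if needed; return True when a push happened.

    `existing` is the in-cluster object as a dict, or None if there is none.
    `push` is a hypothetical callable that writes an object to the cluster.
    """
    if existing is None:
        push(required)
        return True
    merged = merge_required(existing, required)
    if merged != existing:
        push(merged)
        return True
    return False
```

A second call with the already-merged object returns `False`, because the merged manifest no longer differs from the in-cluster object.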
| 109 | +Some types have additional logic, as described in the following subsections. |
| 110 | +Note that this logic only applies to manifests included in the release image itself. |
| 111 | +For example, only [ClusterOperator](../dev/clusteroperator.md) from the release image will have the blocking logic described [below](#clusteroperator); if an admin or secondary operator pushed a ClusterOperator object, it would not impact the cluster-version operator's graph synchronization. |
| 112 | + |
| 113 | +### ClusterOperator |
| 114 | + |
| 115 | +The cluster-version operator does not push [ClusterOperator](../dev/clusteroperator.md) into the cluster. |
| 116 | +Instead, the operators create ClusterOperator themselves. |
| 117 | +The ClusterOperator builder only monitors the in-cluster object and blocks until it is: |
| 118 | + |
| 119 | +* Available |
| 120 | +* Either not progressing or, when the release image manifest has `status.versions` entries, listing at least the versions given in that manifest. |
| 121 | + For example, an OpenShift API server ClusterOperator entry in the release image like: |
| 122 | + |
| 123 | + ```yaml |
| 124 | + apiVersion: config.openshift.io/v1 |
| 125 | + kind: ClusterOperator |
| 126 | + metadata: |
| 127 | + name: openshift-apiserver |
| 128 | + spec: {} |
| 129 | + status: |
| 130 | + versions: |
| 131 | + - name: operator |
| 132 | + version: "4.1.0" |
| 133 | + ``` |
| 134 | + |
| 135 | + would block until the in-cluster ClusterOperator reported `operator` at version 4.1.0. |
| 136 | + |
| 137 | + The progressing check is deprecated and will be removed once all operators are reporting versions. |
| 138 | +* Not degraded (except during initialization, where we ignore the degraded status) |
| 139 | + |
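The blocking conditions in the list above can be expressed as a single predicate over the in-cluster object. This is a sketch under one reading of the rules: when the release-image manifest lists `status.versions`, the versions are compared, and otherwise the Progressing condition is consulted. The function names and the plain-dict object shape are assumptions for this example.

```python
def condition_is_true(cluster_operator, condition_type):
    """Report whether the named status condition is present with status "True"."""
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == condition_type and c.get("status") == "True"
        for c in conditions
    )


def versions_match(required, actual):
    """Check that every version in the release-image manifest is reported in-cluster."""
    reported = {
        v["name"]: v["version"]
        for v in actual.get("status", {}).get("versions", [])
    }
    wanted = required.get("status", {}).get("versions", [])
    return all(reported.get(v["name"]) == v["version"] for v in wanted)


def cluster_operator_ready(required, actual, initializing=False):
    """Return True when the builder would stop blocking on `actual`."""
    if not condition_is_true(actual, "Available"):
        return False
    if required.get("status", {}).get("versions"):
        if not versions_match(required, actual):
            return False
    elif condition_is_true(actual, "Progressing"):
        # Deprecated fallback for manifests without version entries.
        return False
    if not initializing and condition_is_true(actual, "Degraded"):
        return False
    return True
```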
| 140 | +### CustomResourceDefinition |
| 141 | + |
| 142 | +After pushing the merged CustomResourceDefinition into the cluster, the builder monitors the in-cluster object and blocks until it is established. |
| 143 | + |
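A minimal sketch of the wait-until-established step, assuming the object is available as a plain dict and that "established" corresponds to the `Established` status condition being `"True"`. The helper names `crd_established` and `wait_for`, the `fetch` callable, and the timeout and interval defaults are hypothetical.

```python
import time


def crd_established(crd):
    """Report whether the CustomResourceDefinition has Established=True."""
    conditions = crd.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Established" and c.get("status") == "True"
        for c in conditions
    )


def wait_for(fetch, is_ready, timeout=60.0, interval=1.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch()` until `is_ready` accepts the result or `timeout` seconds pass."""
    deadline = clock() + timeout
    while True:
        if is_ready(fetch()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.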
| 144 | +### DaemonSet |
| 145 | + |
| 146 | +The builder does not block after an initial DaemonSet push (when the in-cluster object has generation 1). |
| 147 | + |
| 148 | +For subsequent updates, the builder blocks until: |
| 149 | + |
| 150 | +* The in-cluster object's observed generation catches up with the specified generation. |
| 151 | +* Pods with the release-image-specified configuration are scheduled on each node. |
| 152 | +* There are no nodes without available, ready pods. |
| 153 | + |
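The DaemonSet conditions can be sketched against the standard DaemonSet status fields. The mapping from each bullet to a field (`observedGeneration`, `updatedNumberScheduled` versus `desiredNumberScheduled`, and `numberUnavailable`) is an assumption of this example rather than a statement of the operator's exact checks, and `daemonset_rolled_out` is a hypothetical name.

```python
def daemonset_rolled_out(daemonset):
    """Return True when the builder would stop blocking on this DaemonSet."""
    generation = daemonset.get("metadata", {}).get("generation", 0)
    status = daemonset.get("status", {})
    if generation <= 1:
        # Initial push: do not block.
        return True
    if status.get("observedGeneration", 0) < generation:
        # The controller has not yet seen the latest spec.
        return False
    if status.get("updatedNumberScheduled", 0) < status.get("desiredNumberScheduled", 0):
        # Some nodes still lack a pod with the updated configuration.
        return False
    return status.get("numberUnavailable", 0) == 0
```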
| 154 | +### Deployment |
| 155 | + |
| 156 | +The builder does not block after an initial Deployment push (when the in-cluster object has generation 1). |
| 157 | + |
| 158 | +For subsequent updates, the builder blocks until: |
| 159 | + |
| 160 | +* The in-cluster object's observed generation catches up with the specified generation. |
| 161 | +* Sufficient pods with the release-image-specified configuration are scheduled to fulfill the requested `replicas`. |
| 162 | +* There are no unavailable replicas. |
| 163 | + |
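The Deployment conditions follow the same pattern. As a sketch, assuming the bullets map onto the standard Deployment fields `observedGeneration`, `updatedReplicas` versus `spec.replicas`, and `unavailableReplicas` (the function name `deployment_rolled_out` is hypothetical):

```python
def deployment_rolled_out(deployment):
    """Return True when the builder would stop blocking on this Deployment."""
    generation = deployment.get("metadata", {}).get("generation", 0)
    status = deployment.get("status", {})
    if generation <= 1:
        # Initial push: do not block.
        return True
    if status.get("observedGeneration", 0) < generation:
        # The controller has not yet seen the latest spec.
        return False
    # Kubernetes defaults spec.replicas to 1 when it is unset.
    replicas = deployment.get("spec", {}).get("replicas", 1)
    if status.get("updatedReplicas", 0) < replicas:
        # Not enough pods with the updated configuration yet.
        return False
    return status.get("unavailableReplicas", 0) == 0
```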
| 164 | +### Job |
| 165 | + |
| 166 | +After pushing the merged Job into the cluster, the builder blocks until the Job succeeds. |