Commit e1ca813

Merge pull request #939 from tkashem/csv-reporting
(proposal) improved csv status reporting
2 parents f6932d0 + d7a9d57 commit e1ca813

1 file changed: 276 additions, 0 deletions

# CSV Reporting

## Motivation
The `ClusterServiceVersion` `CustomResource` needs to report useful, contextual information to the user via the `status` sub-resource. An end user associates an operator with a `CSV` and is primarily interested in the status of the deployment or upgrade of that operator. The following events, among many others, are of interest:

* An instance of the operator managed by the `csv` is being installed (no previous version exists).
* An operator is being upgraded to a desired version.
* An operator has been successfully installed or upgraded.
* An error occurs while an operator install or upgrade is in progress.
* An operator is being removed.

### Conventions
In order to design a status that makes sense in the context of Kubernetes resources, it's important to conform to current conventions. This will also help us avoid problems that may have already been solved.

In light of this, `ClusterServiceVersion` will have the following `Condition` type(s).

```go
// ClusterServiceVersionConditionType is the state of the underlying operator.
type ClusterServiceVersionConditionType string

const (
	// Available means that the underlying operator has been deployed successfully
	// and it has passed all liveness/readiness check(s) performed by olm.
	OperatorAvailable ClusterServiceVersionConditionType = "Available"

	// Progressing means that the deployment of the underlying operator is in progress.
	OperatorProgressing ClusterServiceVersionConditionType = "Progressing"

	// We can add more condition type(s) as we see fit.
)
```

The current definition of `ClusterServiceVersionCondition` does not conform to Kubernetes `status` conventions. We will make the following change(s) to bring it in line with them.
* Remove `LastUpdateTime` from `ClusterServiceVersionCondition`. There is no logic that depends on this field.
* Remove `Phase`.
* Add `Type` of `ClusterServiceVersionConditionType` type.
* Add `Status` of `corev1.ConditionStatus` type.

```go
import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type ClusterServiceVersionCondition struct {
	// Type is the type of ClusterServiceVersion condition.
	Type ClusterServiceVersionConditionType `json:"type" description:"type of ClusterServiceVersion condition"`

	// Status is the status of the condition, one of True, False, Unknown.
	Status corev1.ConditionStatus `json:"status" description:"status of the condition, one of True, False, Unknown"`

	// Reason is a one-word CamelCase reason for the condition's last transition.
	// +optional
	Reason ConditionReason `json:"reason,omitempty" description:"one-word CamelCase reason for the condition's last transition"`

	// Message is a human-readable message indicating details about last transition.
	// +optional
	Message string `json:"message,omitempty" description:"human-readable message indicating details about last transition"`

	// LastHeartbeatTime is the last time we got an update on a given condition.
	// +optional
	LastHeartbeatTime *metav1.Time `json:"lastHeartbeatTime,omitempty" description:"last time we got an update on a given condition"`

	// LastTransitionTime is the last time the condition transitioned from one status to another.
	// +optional
	LastTransitionTime *metav1.Time `json:"lastTransitionTime,omitempty" description:"last time the condition transitioned from one status to another"`
}
```
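
With `Type` and `Status` in place, a client can ask a direct question such as "is the operator available?" instead of interpreting a phase. The following is a minimal sketch of that lookup, not part of the proposal: `StatusOf` is a hypothetical helper, and `ConditionStatus` is a local stand-in for `corev1.ConditionStatus` so that the example compiles without Kubernetes dependencies.

```go
package main

import "fmt"

// ConditionStatus is a local stand-in for corev1.ConditionStatus so that the
// sketch compiles without Kubernetes dependencies.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

type ClusterServiceVersionConditionType string

const (
	OperatorAvailable   ClusterServiceVersionConditionType = "Available"
	OperatorProgressing ClusterServiceVersionConditionType = "Progressing"
)

// ClusterServiceVersionCondition is a trimmed copy of the proposed struct;
// the timestamp fields are left out because the lookup does not need them.
type ClusterServiceVersionCondition struct {
	Type    ClusterServiceVersionConditionType
	Status  ConditionStatus
	Reason  string
	Message string
}

// StatusOf is a hypothetical helper: it returns the status of the condition
// with the given type, or ConditionUnknown if no such condition is present.
func StatusOf(conds []ClusterServiceVersionCondition, t ClusterServiceVersionConditionType) ConditionStatus {
	for _, c := range conds {
		if c.Type == t {
			return c.Status
		}
	}
	return ConditionUnknown
}

func main() {
	conds := []ClusterServiceVersionCondition{
		{Type: OperatorProgressing, Status: ConditionTrue, Message: "Working towards v1.0.0"},
		{Type: OperatorAvailable, Status: ConditionFalse, Reason: "CSVReasonRequirementsUnknown"},
	}
	fmt.Println("available:", StatusOf(conds, OperatorAvailable) == ConditionTrue)
	fmt.Println("progressing:", StatusOf(conds, OperatorProgressing) == ConditionTrue)
}
```

Treating a missing condition as `Unknown` is an assumption of this sketch; it keeps callers from reading absence as `False`.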

### Current Phase
A CSV transitions through a set of phase(s) during its life cycle. These phase(s) are internal to olm. Currently the following group of fields tracks the current phase.
```go
type ClusterServiceVersionStatus struct {
	// Current condition of the ClusterServiceVersion
	Phase ClusterServiceVersionPhase `json:"phase,omitempty"`
	// A human readable message indicating details about why the ClusterServiceVersion is in this condition.
	// +optional
	Message string `json:"message,omitempty"`
	// A brief CamelCase message indicating details about why the ClusterServiceVersion is in this state.
	// e.g. 'RequirementsNotMet'
	// +optional
	Reason ConditionReason `json:"reason,omitempty"`
	// Last time we updated the status
	// +optional
	LastUpdateTime metav1.Time `json:"lastUpdateTime,omitempty"`
	// Last time the status transitioned from one status to another.
	// +optional
	LastTransitionTime metav1.Time `json:"lastTransitionTime,omitempty"`
}
```

We can consolidate this set of fields into a separate structure as follows.
```go
type ClusterServiceVersionTransition struct {
	// Name of the phase.
	Phase ClusterServiceVersionPhase `json:"phase,omitempty"`
	// A human readable message indicating details about why the ClusterServiceVersion is in this phase.
	// +optional
	Message string `json:"message,omitempty"`
	// A brief CamelCase message indicating details about why the ClusterServiceVersion is in this phase.
	// e.g. 'RequirementsNotMet'
	// +optional
	Reason ConditionReason `json:"reason,omitempty"`
	// Last time we updated the status
	// +optional
	LastUpdateTime metav1.Time `json:"lastUpdateTime,omitempty"`
	// Last time the status transitioned to this phase.
	// +optional
	LastTransitionTime metav1.Time `json:"lastTransitionTime,omitempty"`
}

type ClusterServiceVersionStatus struct {
	CurrentPhase ClusterServiceVersionTransition `json:"phase,omitempty"`
}
```

This will probably break the current UI since it expects the `phase` name to be directly under `status` for any CSV.

### Transition History
Today we keep appending new `ClusterServiceVersionCondition` items to `Status.Conditions`. This does not conform to the conventions. `Status.Conditions` will now hold only the last observed `ClusterServiceVersionCondition` of each type. Typically it will contain one item per `ClusterServiceVersionConditionType`, as shown below.

```yaml
conditions:
- type: Progressing
  status: "True"
  message: Working towards v1.0.0
- type: Available
  status: "False"
  reason: CSVReasonRequirementsUnknown
  message: Scheduling ClusterServiceVersion for requirement verification
```

If we want to continue to maintain a history of the last N phase transition(s) or activities, we can add the following to `status`.
```go
type ClusterServiceVersionStatus struct {
	LastTransitions []ClusterServiceVersionTransition `json:"lastTransitions,omitempty"`
}
```
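
A history of the last N transitions needs a bound so that `status` does not grow without limit. As a rough sketch, a hypothetical `AppendTransition` helper could drop the oldest entries once the limit is exceeded. The `limit` parameter and the trimmed struct are assumptions made for illustration; the proposal does not fix a value for N.

```go
package main

import "fmt"

// ClusterServiceVersionTransition is a trimmed stand-in for the proposed struct.
type ClusterServiceVersionTransition struct {
	Phase   string
	Message string
}

// AppendTransition is a hypothetical helper that records t and keeps only the
// most recent limit entries, oldest first.
func AppendTransition(history []ClusterServiceVersionTransition, t ClusterServiceVersionTransition, limit int) []ClusterServiceVersionTransition {
	if limit <= 0 {
		return nil
	}
	history = append(history, t)
	if len(history) <= limit {
		return history
	}
	// Copy the tail so the dropped entries can be garbage collected.
	trimmed := make([]ClusterServiceVersionTransition, limit)
	copy(trimmed, history[len(history)-limit:])
	return trimmed
}

func main() {
	var history []ClusterServiceVersionTransition
	for _, phase := range []string{"Pending", "Succeeded", "Replacing"} {
		history = AppendTransition(history, ClusterServiceVersionTransition{Phase: phase}, 2)
	}
	for _, t := range history {
		fmt.Println(t.Phase)
	}
}
```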

### Versions
The `ClusterServiceVersion` resource needs to report the version of the operator it manages. The version information should have a `name` and a `version`. It must always match the currently installed version. If `v1.0.0` is currently installed, then this must indicate `v1.0.0` even if the associated `ClusterServiceVersion` is in the process of installing a new version `v1.1.0`.

```go
type OperatorVersion struct {
	// Name is the name of the operator.
	Name string `json:"name"`

	// Version of the operator currently installed.
	Version string `json:"version"`
}
```

```go
type ClusterServiceVersionStatus struct {
	// List of the last observed conditions, one per condition type.
	Conditions []ClusterServiceVersionCondition `json:"conditions,omitempty"`

	// Version of the underlying operator.
	Version OperatorVersion `json:"version,omitempty"`
}
```
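
The rule that `version` tracks what is installed, not what is being installed, can be sketched as a small function. This is only an illustration: `ReportedVersion` is a hypothetical helper, and using the `Succeeded` phase as the moment the target version counts as installed is an assumption.

```go
package main

import "fmt"

type OperatorVersion struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// ReportedVersion is a hypothetical helper. It keeps reporting the installed
// version until the install or upgrade reaches the Succeeded phase, and only
// then switches to the target version.
func ReportedVersion(installed, target OperatorVersion, phase string) OperatorVersion {
	if phase == "Succeeded" {
		return target
	}
	return installed
}

func main() {
	installed := OperatorVersion{Name: "etcd", Version: "1.0.0"}
	target := OperatorVersion{Name: "etcd", Version: "1.1.0"}
	fmt.Println(ReportedVersion(installed, target, "Pending").Version)
	fmt.Println(ReportedVersion(installed, target, "Succeeded").Version)
}
```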

### Related Objects
An end user looking at a csv should be able to access the related resource(s) associated with it. For example, the end user should be able to:
* Refer to the `Subscription` object that is associated with the `csv`.
* Refer to the `InstallPlan` object that created this `csv`. Is this useful during troubleshooting?
* Refer to the `CatalogSource` object that contains the operator manifest.

To make it easier for the end user to access the related resource(s), the following change is being proposed.
```go
type ClusterServiceVersionStatus struct {
	// Option 1:

	// InstallPlanRef is a reference to the InstallPlan that created this CSV.
	// +optional
	InstallPlanRef *corev1.ObjectReference `json:"installPlanRef,omitempty"`

	// SubscriptionRef is a reference to the Subscription related to this CSV.
	// +optional
	SubscriptionRef *corev1.ObjectReference `json:"subscriptionRef,omitempty"`

	// CatalogSourceRef is a reference to the CatalogSource related to this CSV.
	// +optional
	CatalogSourceRef *corev1.ObjectReference `json:"catalogSourceRef,omitempty"`

	// Option 2:
	// An array of ObjectReference pointing to the related object(s)
	RelatedObjects []*corev1.ObjectReference `json:"relatedObjects,omitempty"`
}
```
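
The two options differ mainly in how a client reaches a given reference. With Option 1 the field name identifies the object; with Option 2 the client has to search the list, for example by `Kind`. The sketch below illustrates the Option 2 lookup. `FindRelated` is a hypothetical helper, `ObjectReference` is a trimmed local stand-in for `corev1.ObjectReference`, and the object names are made up.

```go
package main

import "fmt"

// ObjectReference is a trimmed local stand-in for corev1.ObjectReference.
type ObjectReference struct {
	Kind      string
	Namespace string
	Name      string
}

// FindRelated is a hypothetical helper for Option 2: it returns the first
// related object of the given kind, or nil if there is none.
func FindRelated(refs []*ObjectReference, kind string) *ObjectReference {
	for _, ref := range refs {
		if ref != nil && ref.Kind == kind {
			return ref
		}
	}
	return nil
}

func main() {
	related := []*ObjectReference{
		{Kind: "Subscription", Namespace: "operators", Name: "etcd"},
		{Kind: "InstallPlan", Namespace: "operators", Name: "install-example"},
	}
	if sub := FindRelated(related, "Subscription"); sub != nil {
		fmt.Printf("subscription: %s/%s\n", sub.Namespace, sub.Name)
	}
	if FindRelated(related, "CatalogSource") == nil {
		fmt.Println("no CatalogSource reference recorded")
	}
}
```

Option 2 can take on new kinds of related objects without a schema change, at the cost of this lookup and of weaker guarantees about which references are present.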

## User Experience
Let's go through some of the use cases related to operator deployment and upgrade and see what the relevant portion of the `status` would look like to an end user/administrator.

### Use Case 1:
A new operator is being installed and olm is running the requirements check.
```yaml
status:
  ...
  phase: Pending
  conditions:
  - type: Progressing
    status: "True"
    message: Working towards v1.0.0
  - type: Available
    status: "False"
    reason: CSVReasonRequirementsUnknown
    message: Scheduling ClusterServiceVersion for requirement verification
```
`status.version` is empty since we don't have any version of the operator installed yet.

### Use Case 2:
A new operator has been successfully installed; no previous version existed.
```yaml
status:
  ...
  phase: Succeeded
  conditions:
  - type: Progressing
    status: "False"
    message: Deployed version v1.0.0
  - type: Available
    status: "True"
  version:
    name: etcd
    version: 1.0.0
```

### Use Case 3:
An existing operator is being upgraded to a new version. This case needs more thought; what is below is just a rough sketch.
```yaml
# This is while the upgrade is in progress.
# Original ClusterServiceVersion that is being replaced.
status:
  ...
  phase: Replacing
  conditions:
  - type: Progressing
    status: "False"
  - type: Available
    status: "False"
    reason: BeingReplaced
  version:
    name: etcd
    version: 1.0.0
---
# Head CSV
status:
  ...
  phase: Pending
  conditions:
  - type: Progressing
    status: "True"
    message: Working toward v2.0.0
  - type: Available
    status: "False"
  version:
    name: etcd
    version: 1.0.0
```
`version` is set to `1.0.0` since this is the last version installed on the cluster. Once the upgrade is successful, the `head` csv status will look like this.
```yaml
status:
  ...
  phase: Succeeded
  conditions:
  - type: Progressing
    status: "False"
    message: Deployed version v2.0.0
  - type: Available
    status: "True"
  version:
    name: etcd
    version: 2.0.0
```
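
Read together, the use cases suggest a mapping from the internal phase to the two user-facing conditions. The sketch below spells out that reading and is not a definitive implementation. `ConditionsForPhase` is a hypothetical function, it covers only the phases shown in the use cases, and reporting `Unknown` for any other phase is an assumption.

```go
package main

import "fmt"

// Condition is a trimmed stand-in for ClusterServiceVersionCondition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// ConditionsForPhase is a hypothetical mapping derived from the use cases.
// Phases that the use cases do not cover are reported as Unknown.
func ConditionsForPhase(phase string) []Condition {
	switch phase {
	case "Pending":
		// Install or upgrade is under way and the operator is not usable yet.
		return []Condition{
			{Type: "Progressing", Status: "True"},
			{Type: "Available", Status: "False"},
		}
	case "Succeeded":
		// Deployment finished and the operator passed its checks.
		return []Condition{
			{Type: "Progressing", Status: "False"},
			{Type: "Available", Status: "True"},
		}
	case "Replacing":
		// This CSV is being replaced by a newer one.
		return []Condition{
			{Type: "Progressing", Status: "False"},
			{Type: "Available", Status: "False", Reason: "BeingReplaced"},
		}
	default:
		return []Condition{
			{Type: "Progressing", Status: "Unknown"},
			{Type: "Available", Status: "Unknown"},
		}
	}
}

func main() {
	for _, phase := range []string{"Pending", "Succeeded", "Replacing"} {
		fmt.Println(phase, ConditionsForPhase(phase))
	}
}
```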
