Commit 18b6c0d

Merge pull request #941 from ecordell/config-proposal
proposal(operator-config): initial proposal for persisting configuration
2 parents ed47bd7 + f6681de commit 18b6c0d

Lines changed: 338 additions & 0 deletions
# Operator Configuration

Status: Pending

Version: Alpha

Implementation Owner: tkashem

Prereqs: https://github.com/operator-framework/operator-lifecycle-manager/pull/931

## Motivation

Cluster administrators may need to configure operators beyond the defaults that come via an operator bundle from OLM, such as:

- Pod placement (node selectors)
- Resource requirements and limits
- Tolerations
- Environment Variables
  - Specifically, Proxy configuration

## Proposal

This configuration could live in a new object, but we are beginning work to consolidate our APIs into a smaller surface. As part of that goal, we will hang new features off of the `Subscription` object.

In the future, as `Subscription` takes more of a front seat in the APIs, it will likely get an alternate name (e.g. `Operator`).

### Subscription Spec Changes

A new section `config` is added to the SubscriptionSpec. (bikeshed note: `podConfig` may be more specific/descriptive)

```yaml
kind: Subscription
metadata:
  name: my-operator
spec:
  package: etcd
  channel: alpha

  # new
  config:
  - selector:
      matchLabels:
        app: etcd-operator
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
    nodeSelector:
      disktype: ssd
    tolerations:
    - key: "key"
      operator: "Equal"
      value: "value"
      effect: "NoSchedule"
    # provide application config via a configmap volume
    volumes:
    - name: config-volume
      configMap:
        name: etcd-operator-config
    volumeMounts:
    - mountPath: /config
      name: config-volume
    # provide application config via env variables
    env:
    - name: SPECIAL_LEVEL_KEY
      valueFrom:
        configMapKeyRef:
          name: special-config
          key: special.how
    envFrom:
    - configMapRef:
        name: etcd-env-config
```

### Subscription Status Changes

The subscription status should reflect whether or not the configuration was successfully applied to the operator pods.

New status conditions (abnormal-true):

```yaml
conditions:
- message: '"SPECIAL_LEVEL_KEY" couldn''t be applied...'
  reason: EnvFailure
  status: "True"
  type: ConfigFailure
- message: No operator pods found matching selector.
  reason: NoMatchingPods
  status: "True"
  type: PodConfigSelectorFailure
```

Reasons for `ConfigFailure`:

- `EnvFailure`
- `EnvFromFailure`
- `VolumeFailure`
- `VolumeMountFailure`
- `TolerationFailure`
- `NodeSelectorFailure`
- `ResourceRequestFailure`
- `ResourceLimitFailure`

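Each reason above appears to correspond to one field of a `config` entry. The following is a minimal Python sketch of that correspondence (OLM itself is written in Go). The `REASON_BY_FIELD` table and the `build_condition` helper are hypothetical names introduced for illustration and are not part of the proposal.

```python
# Hypothetical mapping from a `config` field to the ConfigFailure reason
# listed in this proposal. The table layout and helper are illustrative only.
REASON_BY_FIELD = {
    "env": "EnvFailure",
    "envFrom": "EnvFromFailure",
    "volumes": "VolumeFailure",
    "volumeMounts": "VolumeMountFailure",
    "tolerations": "TolerationFailure",
    "nodeSelector": "NodeSelectorFailure",
    "resources.requests": "ResourceRequestFailure",
    "resources.limits": "ResourceLimitFailure",
}


def build_condition(field, message):
    """Build an abnormal-true ConfigFailure condition for a field that failed to apply."""
    return {
        "type": "ConfigFailure",
        "status": "True",  # condition status is a string, not a boolean
        "reason": REASON_BY_FIELD[field],
        "message": message,
    }
```

For example, `build_condition("env", "...")` would yield a condition with reason `EnvFailure`, in the same shape as the YAML conditions shown earlier.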
### Implementation

#### Subscription Spec and Status

Spec and Status need to be updated to include the fields described above, and the OpenAPI validation should be updated as well.

#### Control Loops

#### Install Strategy

Most of the change will take place in the install strategy, which knows how to take the deployment spec defined in a CSV, check whether the cluster is up-to-date, and apply changes if needed.

- The install strategy will now need to accept the additional configuration from the subscription.
- `CheckInstalled` will need to combine the additional config with the deployment spec from the ClusterServiceVersion.
  - Subscription config will overwrite the settings on the CSV.
- `Install` will also need to combine the additional config with the deployment spec.

Appropriate errors should be returned so that we can construct the status needed for the subscription config status conditions.

This requires that https://github.com/operator-framework/operator-lifecycle-manager/pull/931 have merged, so that changes to a deployment are persisted to the cluster.

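The "combine" step used by `CheckInstalled` and `Install` can be sketched as a pure function that layers the subscription config over the pod spec from the CSV. This is a minimal Python illustration, not OLM's Go implementation. The function names and the merge rules (env vars merged by name, other fields replaced wholesale, container-level settings applied to every container) are assumptions, since the proposal only states that subscription config overwrites the CSV settings.

```python
import copy


def merge_by_name(base, overrides):
    """Merge two lists of named items; an override replaces a base item with the same name."""
    merged = {item["name"]: item for item in base}
    for item in overrides:
        merged[item["name"]] = copy.deepcopy(item)
    return list(merged.values())


def apply_config(pod_spec, config):
    """Return a copy of pod_spec (from the CSV) with the subscription config layered on top."""
    result = copy.deepcopy(pod_spec)
    # Pod-level settings: the subscription value replaces the CSV value (assumed rule).
    for field in ("nodeSelector", "tolerations"):
        if field in config:
            result[field] = copy.deepcopy(config[field])
    # Container-level settings are applied to every container in the pod (assumed rule).
    for container in result.get("containers", []):
        if "resources" in config:
            container["resources"] = copy.deepcopy(config["resources"])
        if "env" in config:
            container["env"] = merge_by_name(container.get("env", []), config["env"])
    return result
```

Because the function returns a new spec and leaves its inputs untouched, the same result could be compared against the cluster state in a check and applied in an install without the two paths diverging. The remaining fields (`volumes`, `volumeMounts`, `envFrom`) would presumably follow the same pattern.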
#### Subscription sync

- Subscription sync will now need to use the install strategy for the installed CSV to determine if the config settings have been properly applied.
  - If not, an appropriate status should be written signalling what is missing, and the associated CSV should be requeued.
  - Requeueing the CSV will attempt to reconcile the config with the deployment on the cluster.

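The sync behaviour described above can be sketched as a check that returns both the conditions to write and a requeue signal. This Python sketch covers only `env` and `nodeSelector`, and every name in it (`sync_subscription_config`, `env_applied`, `failure`) is hypothetical. The real logic would live in OLM's Go install strategy.

```python
def env_applied(desired_env, observed_env):
    """True if every desired env var is present, unchanged, in observed_env."""
    observed = {e["name"]: e for e in observed_env}
    return all(observed.get(e["name"]) == e for e in desired_env)


def failure(reason, message):
    """Build an abnormal-true ConfigFailure condition."""
    return {"type": "ConfigFailure", "status": "True", "reason": reason, "message": message}


def sync_subscription_config(config, deployed_pod_spec):
    """Compare subscription config with the deployed pod spec.

    Returns (conditions, requeue_csv): the failure conditions to write to the
    subscription status, and whether the associated CSV should be requeued.
    """
    conditions = []
    container_envs = [c.get("env", []) for c in deployed_pod_spec.get("containers", [])]
    if "env" in config and not all(env_applied(config["env"], env) for env in container_envs):
        conditions.append(failure("EnvFailure", "env vars are not set on the deployment"))
    if "nodeSelector" in config and deployed_pod_spec.get("nodeSelector") != config["nodeSelector"]:
        conditions.append(failure("NodeSelectorFailure", "nodeSelector is not set on the deployment"))
    return conditions, bool(conditions)
```

Requeueing only when at least one condition is produced keeps the loop quiet once the deployment matches the config.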
#### OpenShift-specific implementation

On start, OLM (catalog operator) needs to:

- Check if the [OpenShift proxy config API](https://github.com/openshift/api/blob/master/config/v1/types_proxy.go) is available
  - If so, set up watches / informers for the GVK to keep our view of the global proxy config up to date.

When reconciling an operator's `Deployment`:

- If the global proxy object is not set or the API doesn't exist, do nothing different.
- If the global proxy object is set and none of `HTTPS_PROXY`, `HTTP_PROXY`, `NO_PROXY` are set on the `Subscription`:
  - Then set those env vars on the deployment and ensure the deployment on the cluster matches.
- If the global proxy object is set and at least one of `HTTPS_PROXY`, `HTTP_PROXY`, `NO_PROXY` is set on the `Subscription`:
  - Then do nothing different. Global proxy config has been overridden by a user.

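The three reconciliation cases above reduce to a small decision function. The Python sketch below is illustrative only: `proxy_env_for` is a hypothetical helper, and the global proxy config is simplified to a dict keyed by env var name rather than the actual OpenShift proxy object.

```python
PROXY_VARS = ("HTTPS_PROXY", "HTTP_PROXY", "NO_PROXY")


def proxy_env_for(subscription_env, global_proxy):
    """Return the proxy env vars to add to the deployment from the global proxy config.

    subscription_env: list of {"name": ..., "value": ...} from the Subscription config.
    global_proxy: simplified dict of proxy values keyed by env var name, or None
                  when the proxy API is unavailable or the object is not set.
    """
    if not global_proxy:
        return []  # case 1: nothing to inject
    user_names = {e["name"] for e in subscription_env}
    if user_names & set(PROXY_VARS):
        return []  # case 3: user set at least one variable, so all three are left alone
    # case 2: none are set by the user, so inject the global values
    return [{"name": n, "value": global_proxy[n]} for n in PROXY_VARS if n in global_proxy]
```

Treating the three variables as a unit avoids mixing a user-supplied `NO_PROXY` with a cluster-supplied `HTTP_PROXY`, which could otherwise produce an inconsistent proxy setup.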
### User Documentation

#### Operator Pod Configuration

For the most part, operators are packaged so that they require no configuration. But there are cases where you may wish to configure certain aspects of an operator's runtime environment and have that configuration persist between operator updates.

Examples of configuration that can be set for operator pods:

- Node Selectors and Tolerations to direct pods to particular nodes
- Pod resource requirements and limits
- Enabling debug logs for an operator via an operator-specific config flag
- Setting or overriding proxy configuration

This configuration is set on the `Subscription` for the operator in an optional `config` block. `config` is a list of configurations, each of which is applied to the operator pods that match its selector. (Operators may consist of many pods, and configuration may only apply to a subset of them.)

```yaml
# ...
config:
- selector:
    matchLabels:
      app: etcd-operator
  resources:
    requests:
      memory: "64Mi"
      cpu: "250m"
    limits:
      memory: "128Mi"
      cpu: "500m"
  nodeSelector:
    disktype: ssd
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"
  # provide application config via a configmap volume
  volumes:
  - name: config-volume
    configMap:
      name: etcd-operator-config
  volumeMounts:
  - mountPath: /config
    name: config-volume
  # provide application config via env variables
  env:
  - name: SPECIAL_LEVEL_KEY
    valueFrom:
      configMapKeyRef:
        name: special-config
        key: special.how
  envFrom:
  - configMapRef:
      name: etcd-env-config
```

#### Caveats

**Template labels:** Operator configuration is applied via label selectors, and the labels are matched based on the labels configured on the operator at install time (not the labels that exist at runtime on the cluster).

For example, if a ClusterServiceVersion is defined:

```yaml
kind: ClusterServiceVersion
spec:
  install:
    spec:
      deployments:
      - name: prometheus-operator
        spec:
          template:
            metadata:
              labels:
                k8s-app: prometheus-operator
```

An operator pod may be created with labels:

```yaml
metadata:
  labels:
    k8s-app: prometheus-operator
    olm.cahash: "123"
```

Then this `Subscription` config with the selector **will match**:

```yaml
config:
- selector:
    matchLabels:
      k8s-app: prometheus-operator
```

But this `Subscription` config with the selector **will not match**:

```yaml
config:
- selector:
    matchLabels:
      olm.cahash: "123"
```

This is because matching is determined from the pod template and not from the real pod on the cluster. Similarly, the configuration will only apply to pods defined by the ClusterServiceVersion that has been installed from the Subscription, and will not apply to any other pods, even if the pod has a matching label.

**Upgrades:** Operator configuration is persisted between updates to the operator, but only if the updated operator's pods continue to match the defined selector. Operator authors should not remove previously-defined template labels unless they wish to prevent previously-defined config from applying.

#### Scenario: Enable debug logs for an operator

In this scenario, we have an operator that has been written to accept a command flag that enables debug logs: `-v=4`.

The relevant section of the `ClusterServiceVersion` (note the new `$(ARGS)`):

```yaml
kind: ClusterServiceVersion
spec:
  install:
    spec:
      deployments:
      - name: prometheus-operator
        spec:
          template:
            metadata:
              labels:
                k8s-app: prometheus-operator
            spec:
              containers:
              - name: prometheus-operator
                args:
                - -namespaces=$(NAMESPACES)
                - $(ARGS)
                env:
                - name: NAMESPACES
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.annotations['olm.targetNamespaces']
```

This can then be configured, optionally, from the subscription:

```yaml
kind: Subscription
metadata:
  name: prometheus
spec:
  package: prometheus
  channel: alpha
  config:
  - selector:
      matchLabels:
        k8s-app: prometheus-operator
    env:
    - name: ARGS
      value: "-v=4"
```

The `Deployment` object will then be updated by OLM:

```yaml
kind: Deployment
spec:
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      containers:
      - name: prometheus-operator
        args:
        - -namespaces=$(NAMESPACES)
        - $(ARGS)
        env:
        - name: NAMESPACES
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['olm.targetNamespaces']
        - name: ARGS
          value: "-v=4"
```

When the operator updates to a newer version, it will still be configured with `-v=4` (though it's up to the operator author whether that continues to have the desired effect).

#### OpenShift Notes

When running in OpenShift, OLM will fill in the config for env vars:

- `HTTP_PROXY`
- `HTTPS_PROXY`
- `NO_PROXY`

if there is a global `ProxyConfig` object defined in the cluster. These are treated as a unit: if one of them is already defined on a `Subscription`, the others will not be changed if the global `ProxyConfig` is changed.
