Commit 19de2fe

Merge pull request #1030 from oracle/graceful-shutdown
Graceful shutdown
2 parents 1ca0766 + 11e7b52 commit 19de2fe

53 files changed: +1175 -503 lines changed


docs-source/content/userguide/managing-domains/domain-lifecycle/startup.md

Lines changed: 155 additions & 67 deletions
@@ -11,13 +11,16 @@ and which servers should be restarted. To start, stop, or restart servers, modif
1111

1212
* [Starting and stopping servers](#starting-and-stopping-servers)
1313
* [Common starting and stopping scenarios](#common-starting-and-stopping-scenarios)
14+
* [Shutdown options](#shutdown-options)
1415
* [Restarting servers](#restarting-servers)
1516
* [Rolling restarts](#rolling-restarts)
1617
* [Common restarting scenarios](#common-restarting-scenarios)
1718

1819
There are properties on the domain resource that specify which servers should be running
1920
and which servers should be restarted. To start, stop, or restart servers, modify these properties on the domain resource
20-
(for example, by using `kubectl` or the Kubernetes REST API). The operator will notice the changes and apply them.
21+
(for example, by using `kubectl` or the Kubernetes REST API). The operator will notice the changes and apply them. Beginning
22+
with operator version 2.2, there are also properties to control server shutdown handling, such as whether the shutdown
23+
will be graceful, the timeout, and whether in-flight sessions are given the opportunity to complete.
2124

2225
### Starting and stopping servers
2326

@@ -74,50 +77,60 @@ In this case, the domain resource does not need to specify `serverStartPolicy`,
7477

7578
For example:
7679
```
77-
domain:
78-
spec:
79-
image: ...
80-
replicas: 10
80+
kind: Domain
81+
metadata:
82+
  name: domain1
83+
spec:
84+
  image: ...
85+
  replicas: 10
8186
```
8287

8388
#### Shut down all the servers
8489
Sometimes you need to completely shut down the domain (for example, take it out of service).
8590
```
86-
domain:
87-
spec:
88-
serverStartPolicy: "NEVER"
89-
...
91+
kind: Domain
92+
metadata:
93+
  name: domain1
94+
spec:
95+
  serverStartPolicy: "NEVER"
96+
  ...
9097
```
9198

9299
#### Only start the Administration Server
93100
Sometimes you want to start the Administration Server only, that is, take the domain out of service but leave the Administration Server running so that you can administer the domain.
94101
```
95-
domain:
96-
spec:
97-
serverStartPolicy: "ADMIN_ONLY"
98-
...
102+
kind: Domain
103+
metadata:
104+
  name: domain1
105+
spec:
106+
  serverStartPolicy: "ADMIN_ONLY"
107+
  ...
99108
```
100109

101110
#### Shut down a cluster
102111
To shut down a cluster (for example, take it out of service), add it to the domain resource and set its `serverStartPolicy` to `NEVER`.
103112
```
104-
domain:
105-
spec:
106-
clusters:
107-
- clusterName: "cluster1"
108-
serverStartPolicy: "NEVER"
109-
...
113+
kind: Domain
114+
metadata:
115+
  name: domain1
116+
spec:
117+
  clusters:
118+
  - clusterName: "cluster1"
119+
    serverStartPolicy: "NEVER"
120+
  ...
110121
```
111122

112123
#### Shut down a specific standalone server
113124
To shut down a specific standalone server, add it to the domain resource and set its `serverStartPolicy` to `NEVER`.
114125
```
115-
domain:
116-
spec:
117-
managedServers:
118-
- serverName: "server1"
119-
serverStartPolicy: "NEVER"
120-
...
126+
kind: Domain
127+
metadata:
128+
  name: domain1
129+
spec:
130+
  managedServers:
131+
  - serverName: "server1"
132+
    serverStartPolicy: "NEVER"
133+
  ...
121134
```
122135

123136
#### Force a specific clustered Managed Server to start
@@ -126,18 +139,81 @@ However, sometimes some of the Managed Servers are different (for example, suppo
126139

127140
This is done by adding the server to the domain resource and setting its `serverStartPolicy` to `ALWAYS`.
128141
```
129-
domain:
130-
spec:
131-
managedServers:
132-
- serverName: "cluster1_server1"
133-
serverStartPolicy: "ALWAYS"
134-
...
142+
kind: Domain
143+
metadata:
144+
  name: domain1
145+
spec:
146+
  managedServers:
147+
  - serverName: "cluster1_server1"
148+
    serverStartPolicy: "ALWAYS"
149+
  ...
135150
```
136151

137152
{{% notice note %}}
138153
The server will count toward the cluster's `replicas` count. Also, if you set `serverStartPolicy` to `ALWAYS` on more servers than the `replicas` count, they will all be started, even though the `replicas` count will be exceeded.
139154
{{%/ notice %}}
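
The counting rule in the note above can be sketched in Python. This is an illustration only, not the operator's implementation: the function name `servers_to_start`, the list-order selection of `IF_NEEDED` servers, and the omission of domain-level and cluster-level policies are all assumptions made for this example.

```python
from typing import Dict, List


def servers_to_start(members: List[str], replicas: int,
                     policies: Dict[str, str]) -> List[str]:
    """Pick the cluster members to start (hypothetical helper).

    members  -- server names in the cluster, in configuration order
    replicas -- the cluster's `replicas` count
    policies -- per-server `serverStartPolicy`; missing means IF_NEEDED
    """
    # Servers marked ALWAYS start unconditionally and count toward replicas.
    started = [m for m in members if policies.get(m) == "ALWAYS"]
    # Fill the remaining slots with IF_NEEDED servers (order is an assumption).
    for m in members:
        if len(started) >= replicas:
            break
        if m not in started and policies.get(m, "IF_NEEDED") == "IF_NEEDED":
            started.append(m)
    return started
```

With four members, `replicas` of 2, and only the third member set to `ALWAYS`, the sketch starts that member plus one other; with three members set to `ALWAYS`, all three start even though `replicas` is 2.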
140155

156+
### Shutdown options
157+
158+
The domain resource includes the element `serverPod` that is available under `spec`, `adminServer` and each entry of
159+
`clusters` and `managedServers`. The `serverPod` element controls many details of how pods are created for server instances.
160+
161+
The `shutdown` element of `serverPod` controls how servers will be shut down. This element has three properties:
162+
`shutdownType`, `timeoutSeconds`, and `ignoreSessions`. The `shutdownType` property specifies the type of shutdown and can be set to either `Graceful` (the default)
163+
or `Forced`. The `timeoutSeconds` property configures how long the server is given to
164+
complete shutdown before the server is killed. The `ignoreSessions` property applies only to graceful shutdown; when it is `false`
165+
(the default), the shutdown process may take longer so that active sessions have time to complete, up to the configured timeout.
166+
The operator runtime monitors the `shutdown` element but will not restart any server pods solely to adjust the shutdown options.
167+
Instead, server pods created or restarted because of another property change will be configured to shut down, at the appropriate
168+
time, using the shutdown options set when the server pod is created.
169+
170+
#### Shutdown environment variables
171+
172+
The operator runtime configures shutdown behavior with the use of the following environment variables. Users may
173+
instead simply configure these environment variables directly. When a user-configured environment variable is present,
174+
the operator will not override the environment variable based on the shutdown configuration.
175+
176+
| Environment Variables | Default Value | Supported Values |
177+
| --- | --- | --- |
178+
| `SHUTDOWN_TYPE` | `Graceful` | `Graceful` or `Forced` |
179+
| `SHUTDOWN_TIMEOUT` | 30 | Whole number in seconds where 0 means no timeout |
180+
| `SHUTDOWN_IGNORE_SESSIONS` | `false` | Boolean indicating if active sessions should be ignored; only applicable if shutdown is graceful |
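
The precedence described above, where a user-configured environment variable wins over the value derived from the `shutdown` element, can be sketched in Python. The function name `shutdown_env` and the string formatting of the values are assumptions for illustration; the operator's actual code may differ.

```python
from typing import Any, Dict


def shutdown_env(shutdown: Dict[str, Any],
                 user_env: Dict[str, str]) -> Dict[str, str]:
    """Combine user-set variables with values derived from `shutdown`."""
    # Values derived from the shutdown element, using the documented defaults.
    derived = {
        "SHUTDOWN_TYPE": str(shutdown.get("shutdownType", "Graceful")),
        "SHUTDOWN_TIMEOUT": str(shutdown.get("timeoutSeconds", 30)),
        "SHUTDOWN_IGNORE_SESSIONS":
            str(bool(shutdown.get("ignoreSessions", False))).lower(),
    }
    env = dict(user_env)
    for name, value in derived.items():
        # A variable the user configured directly is never overridden.
        env.setdefault(name, value)
    return env
```

For example, calling `shutdown_env({"timeoutSeconds": 45}, {"SHUTDOWN_TIMEOUT": "120"})` keeps the user's `120` and fills in the other two variables from the defaults.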
181+
182+
#### `shutdown` rules
183+
184+
You can specify the `serverPod` element, including the `shutdown` element, at the domain, cluster, and server levels. If
185+
`shutdown` is specified at multiple levels, such as for a cluster and for a member server that is part of that cluster,
186+
then the shutdown configuration for a specific server is the combination of all of the relevant values with each field
187+
having the value from the `shutdown` element at the most specific scope.
188+
189+
For instance, given the following domain resource:
190+
```
191+
kind: Domain
192+
metadata:
193+
  name: domain1
194+
spec:
195+
  serverPod:
196+
    shutdown:
197+
      shutdownType: Graceful
198+
      timeoutSeconds: 45
199+
  clusters:
200+
  - clusterName: "cluster1"
201+
    serverPod:
202+
      shutdown:
203+
        ignoreSessions: true
204+
  managedServers:
205+
  - serverName: "cluster1_server1"
206+
    serverPod:
207+
      shutdown:
208+
        timeoutSeconds: 60
209+
        ignoreSessions: false
210+
  ...
211+
```
212+
213+
Graceful shutdown is used for all servers in the domain because this is specified at the domain level and is not overridden at
214+
any cluster or server level. The "cluster1" cluster defaults to ignoring sessions; however, the "cluster1_server1" server
215+
instance will not ignore sessions and will have a longer timeout.
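
The scoping rule can be sketched as a field-by-field merge from the least specific to the most specific level. The Python below is a minimal illustration under stated assumptions: the helper names are hypothetical, the cluster a server belongs to is passed in explicitly (the domain resource shown here does not record membership), and the defaults are the ones listed in the environment variable table.

```python
from typing import Any, Dict, List, Optional

DEFAULTS: Dict[str, Any] = {
    "shutdownType": "Graceful", "timeoutSeconds": 30, "ignoreSessions": False,
}

# The `spec` of the example domain resource above, as a Python mapping.
EXAMPLE_SPEC: Dict[str, Any] = {
    "serverPod": {"shutdown": {"shutdownType": "Graceful", "timeoutSeconds": 45}},
    "clusters": [
        {"clusterName": "cluster1",
         "serverPod": {"shutdown": {"ignoreSessions": True}}},
    ],
    "managedServers": [
        {"serverName": "cluster1_server1",
         "serverPod": {"shutdown": {"timeoutSeconds": 60, "ignoreSessions": False}}},
    ],
}


def _shutdown_of(scope: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the serverPod.shutdown mapping of a scope, or {} if absent."""
    return ((scope or {}).get("serverPod") or {}).get("shutdown") or {}


def _find(entries: List[Dict[str, Any]], key: str,
          name: Optional[str]) -> Optional[Dict[str, Any]]:
    """Return the first entry whose `key` equals `name`, if any."""
    if name is None:
        return None
    return next((e for e in entries if e.get(key) == name), None)


def effective_shutdown(spec: Dict[str, Any], server_name: str,
                       cluster_name: Optional[str] = None) -> Dict[str, Any]:
    """Merge shutdown settings: defaults, then domain, cluster, server."""
    cluster = _find(spec.get("clusters", []), "clusterName", cluster_name)
    server = _find(spec.get("managedServers", []), "serverName", server_name)
    result = dict(DEFAULTS)
    for scope in (spec, cluster, server):  # later (more specific) scopes win
        result.update(_shutdown_of(scope))
    return result
```

Under these assumptions, `effective_shutdown(EXAMPLE_SPEC, "cluster1_server1", "cluster1")` yields a graceful shutdown with a 60 second timeout that does not ignore sessions, while any other member of "cluster1" gets the 45 second domain-level timeout and ignores sessions.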
216+
141217
### Restarting servers
142218

143219
The operator runtime automatically recreates (restarts) server pods when properties on the domain resource that affect server pods change (such as `image`, `volumes`, and `env`).
@@ -211,51 +287,59 @@ The servers will also be restarted if `restartVersion` is removed from the domai
211287
Set `restartVersion` at the domain level to a new value.
212288

213289
```
214-
domain:
215-
spec:
216-
restartVersion: "domainV1"
217-
...
290+
kind: Domain
291+
metadata:
292+
  name: domain1
293+
spec:
294+
  restartVersion: "domainV1"
295+
  ...
218296
```
219297

220298
#### Restart all the servers in the cluster
221299

222300
Set `restartVersion` at the cluster level to a new value.
223301

224302
```
225-
domain:
226-
spec:
227-
clusters:
228-
- clusterName : "cluster1"
229-
restartVersion: "cluster1V1"
230-
maxUnavailable: 2
231-
...
303+
kind: Domain
304+
metadata:
305+
  name: domain1
306+
spec:
307+
  clusters:
308+
  - clusterName: "cluster1"
309+
    restartVersion: "cluster1V1"
310+
    maxUnavailable: 2
311+
  ...
232312
```
233313

234314
#### Restart the Administration Server
235315

236316
Set `restartVersion` at the `adminServer` level to a new value.
237317

238318
```
239-
domain:
240-
spec:
241-
adminServer:
242-
restartVersion: "adminV1"
243-
...
319+
kind: Domain
320+
metadata:
321+
  name: domain1
322+
spec:
323+
  adminServer:
324+
    restartVersion: "adminV1"
325+
  ...
244326
```
245327

246328
#### Restart a standalone or clustered Managed Server
247329

248330
Set `restartVersion` at the `managedServers` level to a new value.
249331

250332
```
251-
domain:
252-
spec:
253-
managedServers:
254-
- serverName: "standalone_server1"
255-
restartVersion: "v1"
256-
- serverName: "cluster1_server1"
257-
restartVersion: "v1"
258-
...
333+
kind: Domain
334+
metadata:
335+
  name: domain1
336+
spec:
337+
  managedServers:
338+
  - serverName: "standalone_server1"
339+
    restartVersion: "v1"
340+
  - serverName: "cluster1_server1"
341+
    restartVersion: "v1"
342+
  ...
259343
```
260344
#### Full domain restarts
261345

@@ -265,23 +349,27 @@ then restart them. Unlike rolling restarts, the operator cannot detect and init
265349
To manually initiate a full domain restart:
266350

267351
1. Change the domain level `serverStartPolicy` on the domain resource to `NEVER`.
268-
```
269-
domain:
270-
spec:
271-
serverStartPolicy: "NEVER"
272-
...
273-
```
352+
```
353+
kind: Domain
354+
metadata:
355+
  name: domain1
356+
spec:
357+
  serverStartPolicy: "NEVER"
358+
  ...
359+
```
274360

275361
2. Wait for the operator to stop ALL the servers for that domain.
276362

277363
3. To restart the domain, set the domain level `serverStartPolicy` back to `IF_NEEDED`. Alternatively, you do not
278364
have to specify the `serverStartPolicy` as the default value is `IF_NEEDED`.
279365

280-
```
281-
domain:
282-
spec:
283-
serverStartPolicy: "IF_NEEDED"
284-
...
285-
```
366+
```
367+
kind: Domain
368+
metadata:
369+
  name: domain1
370+
spec:
371+
  serverStartPolicy: "IF_NEEDED"
372+
  ...
373+
```
286374

287375
4. The operator will restart all the servers in the domain.
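
The two edits in the steps above can also be expressed as JSON merge patch bodies. The Python below is a small sketch that only builds those bodies; the function names are hypothetical, and how the patch is submitted (for example, with `kubectl patch` and `--type=merge`, or through the Kubernetes REST API) depends on your installation.

```python
import json
from typing import List

# Domain-level values mentioned in this document.
DOMAIN_POLICIES = {"NEVER", "IF_NEEDED", "ADMIN_ONLY"}


def start_policy_patch(policy: str) -> str:
    """Return a JSON merge patch body setting spec.serverStartPolicy."""
    if policy not in DOMAIN_POLICIES:
        raise ValueError(f"unsupported domain-level policy: {policy}")
    return json.dumps({"spec": {"serverStartPolicy": policy}})


def full_restart_patches() -> List[str]:
    """Bodies for step 1 (stop everything) and step 3 (start again)."""
    return [start_policy_patch("NEVER"), start_policy_patch("IF_NEEDED")]
```

The second body must only be applied after the operator has stopped all the servers (step 2); the sketch does not wait or check for that.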

docs/domains/Domain.json

Lines changed: 26 additions & 0 deletions
@@ -493,6 +493,10 @@
493493
"items": {
494494
"$ref": "https://github.com/garethr/kubernetes-json-schema/blob/master/v1.9.0/_definitions.json#/definitions/io.k8s.api.core.v1.Container"
495495
}
496+
},
497+
"shutdown": {
498+
"description": "Configures how the operator should shutdown the server instance.",
499+
"$ref": "#/definitions/Shutdown"
496500
}
497501
}
498502
},
@@ -521,6 +525,28 @@
521525
}
522526
}
523527
},
528+
"Shutdown": {
529+
"description": "Shutdown describes the configuration for shutting down a server instance.",
530+
"type": "object",
531+
"properties": {
532+
"ignoreSessions": {
533+
"description": "For graceful shutdown only, indicates to ignore pending HTTP sessions during in-flight work handling. Not required. Defaults to false.",
534+
"type": "boolean"
535+
},
536+
"shutdownType": {
537+
"description": "Tells the operator how to shutdown server instances. Not required. Defaults to graceful shutdown.",
538+
"type": "string",
539+
"enum": [
540+
"Graceful",
541+
"Forced"
542+
]
543+
},
544+
"timeoutSeconds": {
545+
"description": "For graceful shutdown only, number of seconds to wait before aborting in-flight work and shutting down the server. Not required. Defaults to 30 seconds.",
546+
"type": "number"
547+
}
548+
}
549+
},
524550
"SubsystemHealth": {
525551
"type": "object",
526552
"properties": {

docs/domains/Domain.md

Lines changed: 11 additions & 0 deletions
@@ -110,6 +110,7 @@ ServerPod describes the configuration for a Kubernetes pod for a server.
110110
| `podSecurityContext` | [Pod Security Context](k8s1.9.0.md#pod-security-context) | Pod-level security attributes. |
111111
| `readinessProbe` | [Probe Tuning](#probe-tuning) | Settings for the readiness probe associated with a server. |
112112
| `resources` | [Resource Requirements](k8s1.9.0.md#resource-requirements) | Memory and cpu minimum requirements and limits for the server. |
113+
| `shutdown` | [Shutdown](#shutdown) | Configures how the operator should shut down the server instance. |
113114
| `volumeMounts` | array of [Volume Mount](k8s1.9.0.md#volume-mount) | Additional volume mounts for the server pod. |
114115
| `volumes` | array of [Volume](k8s1.9.0.md#volume) | Additional volumes to be created in the server pod. |
115116

@@ -157,6 +158,16 @@ ServerPod describes the configuration for a Kubernetes pod for a server.
157158
| `periodSeconds` | number | The number of seconds between checks |
158159
| `timeoutSeconds` | number | The number of seconds with no response that indicates a failure |
159160

161+
### Shutdown
162+
163+
Shutdown describes the configuration for shutting down a server instance.
164+
165+
| Name | Type | Description |
166+
| --- | --- | --- |
167+
| `ignoreSessions` | Boolean | For graceful shutdown only, indicates to ignore pending HTTP sessions during in-flight work handling. Not required. Defaults to false. |
168+
| `shutdownType` | string | Tells the operator how to shut down server instances. Not required. Defaults to graceful shutdown. |
169+
| `timeoutSeconds` | number | For graceful shutdown only, number of seconds to wait before aborting in-flight work and shutting down the server. Not required. Defaults to 30 seconds. |
170+
160171
### Server Health
161172

162173
| Name | Type | Description |
