
Commit 720812a

Merge pull request #23755 from sftim/20200908_revise_task_safely_drain_node
Revise node draining task page
2 parents: c7bc044 + b838ea7

1 file changed: +39 −40 lines changed

content/en/docs/tasks/administer-cluster/safely-drain-node.md

Lines changed: 39 additions & 40 deletions
@@ -6,23 +6,21 @@ reviewers:
 - kow3ns
 title: Safely Drain a Node while Respecting the PodDisruptionBudget
 content_type: task
+min-kubernetes-server-version: 1.5
 ---

 <!-- overview -->
-This page shows how to safely drain a node, respecting the PodDisruptionBudget you have defined.
-
+This page shows how to safely drain a {{< glossary_tooltip text="node" term_id="node" >}},
+respecting the PodDisruptionBudget you have defined.

 ## {{% heading "prerequisites" %}}

-
-This task assumes that you have met the following prerequisites:
-
-* You are using Kubernetes release >= 1.5.
-* Either:
+{{% version-check %}}
+This task also assumes that you have met the following prerequisites:
 1. You do not require your applications to be highly available during the
    node drain, or
-1. You have read about the [PodDisruptionBudget concept](/docs/concepts/workloads/pods/disruptions/)
-   and [Configured PodDisruptionBudgets](/docs/tasks/run-application/configure-pdb/) for
+1. You have read about the [PodDisruptionBudget](/docs/concepts/workloads/pods/disruptions/) concept,
+   and have [configured PodDisruptionBudgets](/docs/tasks/run-application/configure-pdb/) for
    applications that need them.

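Since the revised prerequisites point readers at configuring PodDisruptionBudgets, a minimal example of creating one may help. This is a sketch, not part of this commit: the budget name `zk-pdb`, the `app: zookeeper` selector, and the `policy/v1beta1` API version are assumptions.

```bash
# Create a PodDisruptionBudget that keeps at least 2 matching Pods available.
# policy/v1beta1 was current for the Kubernetes releases this page targets;
# newer clusters use policy/v1.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
EOF
```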

@@ -35,10 +33,10 @@ You can use `kubectl drain` to safely evict all of your pods from a
 node before you perform maintenance on the node (e.g. kernel upgrade,
 hardware maintenance, etc.). Safe evictions allow the pod's containers
 to [gracefully terminate](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
-and will respect the `PodDisruptionBudgets` you have specified.
+and will respect the PodDisruptionBudgets you have specified.

 {{< note >}}
-By default `kubectl drain` will ignore certain system pods on the node
+By default `kubectl drain` ignores certain system pods on the node
 that cannot be killed; see
 the [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain)
 documentation for more details.
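For readers of this diff, the basic drain-and-restore workflow the page describes looks like the following sketch; the node name `node-1` is hypothetical.

```bash
# Mark the node unschedulable and evict its Pods, respecting PodDisruptionBudgets.
# DaemonSet-managed Pods cannot be evicted, so they are typically skipped explicitly.
kubectl drain node-1 --ignore-daemonsets

# ...perform the kernel upgrade or other maintenance...

# Allow new Pods to be scheduled onto the node again.
kubectl uncordon node-1
```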
@@ -78,29 +76,29 @@ The `kubectl drain` command should only be issued to a single node at a
 time. However, you can run multiple `kubectl drain` commands for
 different nodes in parallel, in different terminals or in the
 background. Multiple drain commands running concurrently will still
-respect the `PodDisruptionBudget` you specify.
+respect the PodDisruptionBudget you specify.

 For example, if you have a StatefulSet with three replicas and have
-set a `PodDisruptionBudget` for that set specifying `minAvailable:
-2`. `kubectl drain` will only evict a pod from the StatefulSet if all
-three pods are ready, and if you issue multiple drain commands in
-parallel, Kubernetes will respect the PodDisruptionBudget and ensure
-that only one pod is unavailable at any given time. Any drains that
-would cause the number of ready replicas to fall below the specified
-budget are blocked.
+set a PodDisruptionBudget for that set specifying `minAvailable: 2`,
+`kubectl drain` only evicts a pod from the StatefulSet if all three
+replica Pods are ready; if you then issue multiple drain commands in
+parallel, Kubernetes respects the PodDisruptionBudget and ensures
+that only 1 (calculated as `replicas - minAvailable`) Pod is unavailable
+at any given time. Any drains that would cause the number of ready
+replicas to fall below the specified budget are blocked.

-## The Eviction API
+## The Eviction API {#eviction-api}

 If you prefer not to use [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain) (such as
 to avoid calling to an external command, or to get finer control over the pod
 eviction process), you can also programmatically cause evictions using the eviction API.

-You should first be familiar with using [Kubernetes language clients](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api).
+You should first be familiar with using [Kubernetes language clients](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api) to access the API.

 The eviction subresource of a
-pod can be thought of as a kind of policy-controlled DELETE operation on the pod
-itself. To attempt an eviction (perhaps more REST-precisely, to attempt to
-*create* an eviction), you POST an attempted operation. Here's an example:
+Pod can be thought of as a kind of policy-controlled DELETE operation on the Pod
+itself. To attempt an eviction (more precisely: to attempt to
+*create* an Eviction), you POST an attempted operation. Here's an example:

 ```json
 {
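The rewritten paragraph above makes the budget arithmetic explicit (`replicas - minAvailable` evictable Pods). You can watch that number directly while drains are in flight; a sketch, assuming the hypothetical `zk-pdb` budget from earlier:

```bash
# ALLOWED DISRUPTIONS reports how many Pods may be evicted right now.
# With replicas=3 and minAvailable=2, it is at most 1 while all Pods are ready.
kubectl get poddisruptionbudget zk-pdb
```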
@@ -116,21 +114,19 @@ itself. To attempt an eviction (perhaps more REST-precisely, to attempt to
 You can attempt an eviction using `curl`:

 ```bash
-curl -v -H 'Content-type: application/json' http://127.0.0.1:8080/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
+curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
 ```

 The API can respond in one of three ways:

-- If the eviction is granted, then the pod is deleted just as if you had sent
-  a `DELETE` request to the pod's URL and you get back `200 OK`.
+- If the eviction is granted, then the Pod is deleted just as if you had sent
+  a `DELETE` request to the Pod's URL and you get back `200 OK`.
 - If the current state of affairs wouldn't allow an eviction by the rules set
   forth in the budget, you get back `429 Too Many Requests`. This is
   typically used for generic rate limiting of *any* requests, but here we mean
   that this request isn't allowed *right now* but it may be allowed later.
-  Currently, callers do not get any `Retry-After` advice, but they may in
-  future versions.
-- If there is some kind of misconfiguration, like multiple budgets pointing at
-  the same pod, you will get `500 Internal Server Error`.
+- If there is some kind of misconfiguration, for example multiple PodDisruptionBudgets
+  that refer to the same Pod, you get a `500 Internal Server Error` response.

 For a given eviction request, there are two cases:

@@ -139,21 +135,25 @@ For a given eviction request, there are two cases:
 - There is at least one budget. In this case, any of the three above responses may
   apply.

-In some cases, an application may reach a broken state where it will never return anything
-other than 429 or 500. This can happen, for example, if the replacement pod created by the
-application's controller does not become ready, or if the last pod evicted has a very long
-termination grace period.
+## Stuck evictions
+
+In some cases, an application may reach a broken state where, unless you intervene, the
+eviction API will never return anything other than 429 or 500.
+
+For example: this can happen if a ReplicaSet is creating Pods for your application but
+the replacement Pods do not become `Ready`. You can also see similar symptoms if the
+last Pod evicted has a very long termination grace period.

 In this case, there are two potential solutions:

-- Abort or pause the automated operation. Investigate the reason for the stuck application, and restart the automation.
-- After a suitably long wait, `DELETE` the pod instead of using the eviction API.
+- Abort or pause the automated operation. Investigate the reason for the stuck application,
+  and restart the automation.
+- After a suitably long wait, `DELETE` the Pod from your cluster's control plane, instead
+  of using the eviction API.

 Kubernetes does not specify what the behavior should be in this case; it is up to the
 application owners and cluster owners to establish an agreement on behavior in these cases.

-
-
 ## {{% heading "whatsnext" %}}


@@ -162,4 +162,3 @@ application owners and cluster owners to establish an agreement on behavior in these cases.



-
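For the second remedy in the new "Stuck evictions" section, the direct `DELETE` corresponds to a plain `kubectl delete`; a sketch, reusing the example Pod name `quux`:

```bash
# Bypass the eviction API (and therefore the PodDisruptionBudget) only after
# a suitably long wait and after investigating why the eviction is stuck.
kubectl delete pod quux --namespace default
```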
