Commit 5638f85

Merge pull request #39392 from arrikto/feature-ds-schedule
Update DaemonSet guide
2 parents 01f524f + d0b3ba5 commit 5638f85

1 file changed

+52
-43
lines changed

content/en/docs/concepts/workloads/controllers/daemonset.md

@@ -105,30 +105,24 @@ If you do not specify either, then the DaemonSet controller will create Pods on
105105

106106
## How Daemon Pods are scheduled
107107

108-
### Scheduled by default scheduler
109-
110-
{{< feature-state for_k8s_version="1.17" state="stable" >}}
111-
112-
A DaemonSet ensures that all eligible nodes run a copy of a Pod. Normally, the
113-
node that a Pod runs on is selected by the Kubernetes scheduler. However,
114-
DaemonSet pods are created and scheduled by the DaemonSet controller instead.
115-
That introduces the following issues:
116-
117-
* Inconsistent Pod behavior: Normal Pods waiting to be scheduled are created
118-
and in `Pending` state, but DaemonSet pods are not created in `Pending`
119-
state. This is confusing to the user.
120-
* [Pod preemption](/docs/concepts/scheduling-eviction/pod-priority-preemption/)
121-
is handled by default scheduler. When preemption is enabled, the DaemonSet controller
122-
will make scheduling decisions without considering pod priority and preemption.
123-
124-
`ScheduleDaemonSetPods` allows you to schedule DaemonSets using the default
125-
scheduler instead of the DaemonSet controller, by adding the `NodeAffinity` term
126-
to the DaemonSet pods, instead of the `.spec.nodeName` term. The default
127-
scheduler is then used to bind the pod to the target host. If node affinity of
128-
the DaemonSet pod already exists, it is replaced (the original node affinity was
129-
taken into account before selecting the target host). The DaemonSet controller only
130-
performs these operations when creating or modifying DaemonSet pods, and no
131-
changes are made to the `spec.template` of the DaemonSet.
108+
A DaemonSet ensures that all eligible nodes run a copy of a Pod. The DaemonSet
109+
controller creates a Pod for each eligible node and adds the
110+
`spec.affinity.nodeAffinity` field of the Pod to match the target host. After
111+
the Pod is created, the default scheduler typically takes over and then binds
112+
the Pod to the target host by setting the `.spec.nodeName` field. If the new
113+
Pod cannot fit on the node, the default scheduler may preempt (evict) some of
114+
the existing Pods based on the
115+
[priority](/docs/concepts/scheduling-eviction/pod-priority-preemption/#pod-priority)
116+
of the new Pod.
117+
118+
The user can specify a different scheduler for the Pods of the DaemonSet, by
119+
setting the `.spec.template.spec.schedulerName` field of the DaemonSet.
120+
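As a minimal sketch of the `.spec.template.spec.schedulerName` field, the manifest below asks for a non-default scheduler. The DaemonSet name, labels, container image, and the scheduler name `my-custom-scheduler` are hypothetical placeholders, and the sketch assumes a scheduler registered under that name is already running in the cluster.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-daemon            # hypothetical name
spec:
  selector:
    matchLabels:
      name: example-daemon
  template:
    metadata:
      labels:
        name: example-daemon
    spec:
      # Assumption: a scheduler with this name is deployed in the cluster.
      # If the field is omitted, the default scheduler binds the Pods.
      schedulerName: my-custom-scheduler
      containers:
      - name: agent
        image: registry.example/agent:1.0   # placeholder image
```

If no scheduler answers to the configured name, the Pods would be expected to stay in `Pending`, so it may be worth checking the scheduler deployment first.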
121+
The original node affinity specified at the
122+
`.spec.template.spec.affinity.nodeAffinity` field (if specified) is taken into
123+
consideration by the DaemonSet controller when evaluating the eligible nodes,
124+
but is replaced on the created Pod with the node affinity that matches the name
125+
of the eligible node.
132126

133127
```yaml
134128
nodeAffinity:
@@ -141,25 +135,40 @@ nodeAffinity:
141135
- target-host-name
142136
```
143137
144-
In addition, `node.kubernetes.io/unschedulable:NoSchedule` toleration is added
145-
automatically to DaemonSet Pods. The default scheduler ignores
146-
`unschedulable` Nodes when scheduling DaemonSet Pods.
147-
148-
### Taints and Tolerations
149-
150-
Although Daemon Pods respect
151-
[taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/),
152-
the following tolerations are added to DaemonSet Pods automatically according to
153-
the related features.
154-
155-
| Toleration Key | Effect | Version | Description |
156-
| ---------------------------------------- | ---------- | ------- | ----------- |
157-
| `node.kubernetes.io/not-ready` | NoExecute | 1.13+ | DaemonSet pods will not be evicted when there are node problems such as a network partition. |
158-
| `node.kubernetes.io/unreachable` | NoExecute | 1.13+ | DaemonSet pods will not be evicted when there are node problems such as a network partition. |
159-
| `node.kubernetes.io/disk-pressure` | NoSchedule | 1.8+ | DaemonSet pods tolerate disk-pressure attributes by default scheduler. |
160-
| `node.kubernetes.io/memory-pressure` | NoSchedule | 1.8+ | DaemonSet pods tolerate memory-pressure attributes by default scheduler. |
161-
| `node.kubernetes.io/unschedulable` | NoSchedule | 1.12+ | DaemonSet pods tolerate unschedulable attributes by default scheduler. |
162-
| `node.kubernetes.io/network-unavailable` | NoSchedule | 1.12+ | DaemonSet pods, who uses host network, tolerate network-unavailable attributes by default scheduler. |
138+
139+
### Taints and tolerations
140+
141+
The DaemonSet controller automatically adds a set of {{< glossary_tooltip
142+
text="tolerations" term_id="toleration" >}} to DaemonSet Pods:
143+
144+
{{< table caption="Tolerations for DaemonSet pods" >}}
145+
146+
| Toleration key | Effect | Details |
147+
| --------------------------------------------------------------------------------------------------------------------- | ------------ | --------------------------------------------------------------------------------------------------------------------------------------------- |
148+
| [`node.kubernetes.io/not-ready`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-not-ready) | `NoExecute` | DaemonSet Pods can be scheduled onto nodes that are not healthy or ready to accept Pods. Any DaemonSet Pods running on such nodes will not be evicted. |
149+
| [`node.kubernetes.io/unreachable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-unreachable) | `NoExecute` | DaemonSet Pods can be scheduled onto nodes that are unreachable from the node controller. Any DaemonSet Pods running on such nodes will not be evicted. |
150+
| [`node.kubernetes.io/disk-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-disk-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with disk pressure issues. |
151+
| [`node.kubernetes.io/memory-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-memory-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with memory pressure issues. |
152+
| [`node.kubernetes.io/pid-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-pid-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with process pressure issues. |
153+
| [`node.kubernetes.io/unschedulable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-unschedulable) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes that are unschedulable. |
154+
| [`node.kubernetes.io/network-unavailable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-network-unavailable) | `NoSchedule` | **Only added for DaemonSet Pods that request host networking**, i.e., Pods having `spec.hostNetwork: true`. Such DaemonSet Pods can be scheduled onto nodes with unavailable network.|
155+
156+
{{< /table >}}
157+
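To make the table above more concrete, the excerpt below sketches how these tolerations might appear in the `spec` of a Pod created for a DaemonSet. It is illustrative rather than captured output: it assumes each toleration is added with `operator: Exists`, and the exact fields and ordering can differ between Kubernetes versions.

```yaml
# Illustrative excerpt of a DaemonSet Pod spec (not real cluster output).
spec:
  tolerations:
  - key: node.kubernetes.io/not-ready
    operator: Exists        # assumption: matches the taint regardless of value
    effect: NoExecute
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
  - key: node.kubernetes.io/disk-pressure
    operator: Exists
    effect: NoSchedule
  - key: node.kubernetes.io/memory-pressure
    operator: Exists
    effect: NoSchedule
  - key: node.kubernetes.io/pid-pressure
    operator: Exists
    effect: NoSchedule
  - key: node.kubernetes.io/unschedulable
    operator: Exists
    effect: NoSchedule
  # Only expected when the Pod sets spec.hostNetwork: true
  - key: node.kubernetes.io/network-unavailable
    operator: Exists
    effect: NoSchedule
```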
158+
You can add your own tolerations to the Pods of a DaemonSet as well, by
159+
defining these in the Pod template of the DaemonSet.
160+
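A minimal sketch of adding a custom toleration through the Pod template follows. The taint key `example.com/special-hardware`, the DaemonSet name, the labels, and the image are hypothetical; substitute the taints actually present on your nodes.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-daemon            # hypothetical name
spec:
  selector:
    matchLabels:
      name: example-daemon
  template:
    metadata:
      labels:
        name: example-daemon
    spec:
      tolerations:
      # Hypothetical taint key: lets the Pods run on nodes tainted with
      # example.com/special-hardware:NoSchedule, in addition to the
      # tolerations that the DaemonSet controller adds automatically.
      - key: example.com/special-hardware
        operator: Exists
        effect: NoSchedule
      containers:
      - name: agent
        image: registry.example/agent:1.0   # placeholder image
```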
161+
Because the DaemonSet controller sets the
162+
`node.kubernetes.io/unschedulable:NoSchedule` toleration automatically,
163+
Kubernetes can run DaemonSet Pods on nodes that are marked as _unschedulable_.
164+
165+
If you use a DaemonSet to provide an important node-level function, such as
166+
[cluster networking](/docs/concepts/cluster-administration/networking/), it is
167+
helpful that Kubernetes places DaemonSet Pods on nodes before they are ready.
168+
For example, without that special toleration, you could end up in a deadlock
169+
situation where the node is not marked as ready because the network plugin is
170+
not running there, and at the same time the network plugin is not running on
171+
that node because the node is not yet ready.
163172

164173
## Communicating with Daemon Pods
165174
