
Commit 4301ed2

docs/reference: Add links for node status config parameters
This commit addresses the request in #52516 by adding explicit documentation links to the "Heartbeats" section of the Node Status page. Fixes #52516
1 parent b422583 commit 4301ed2

1 file changed

content/en/docs/reference/node/node-status.md

Lines changed: 56 additions & 62 deletions
@@ -3,8 +3,6 @@ content_type: reference
 title: Node Status
 weight: 80
 ---
-<!-- overview -->
-
 The status of a [node](/docs/concepts/architecture/nodes/) in Kubernetes is a critical
 aspect of managing a Kubernetes cluster. In this article, we'll cover the basics of
 monitoring and maintaining node status to ensure a healthy and stable cluster.
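Not part of the diff itself, but a quick way to see the status this page documents; standard `kubectl`, with illustrative output:

```shell
# List every node with its Ready status, roles, and kubelet version
kubectl get nodes

# Illustrative output:
# NAME           STATUS   ROLES           AGE   VERSION
# controlplane   Ready    control-plane   12d   v1.29.0
# worker-1       Ready    <none>          12d   v1.29.0
```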
@@ -22,74 +20,70 @@ You can use `kubectl` to view a Node's status and other details:
 
 ```shell
 kubectl describe node <insert-node-name-here>
-```
-
 Each section of the output is described below.
 
-## Addresses
-
+Addresses
 The usage of these fields varies depending on your cloud provider or bare metal configuration.
 
-* HostName: The hostname as reported by the node's kernel. Can be overridden via the kubelet
-  `--hostname-override` parameter.
-* ExternalIP: Typically the IP address of the node that is externally routable (available from
-  outside the cluster).
-* InternalIP: Typically the IP address of the node that is routable only within the cluster.
+HostName: The hostname as reported by the node's kernel. Can be overridden via the kubelet
+  --hostname-override parameter.
 
-## Conditions {#condition}
+ExternalIP: Typically the IP address of the node that is externally routable (available from
+  outside the cluster).
 
-The `conditions` field describes the status of all `Running` nodes. Examples of conditions include:
+InternalIP: Typically the IP address of the node that is routable only within the cluster.
+
+Conditions {#condition}
+The conditions field describes the status of all Running nodes. Examples of conditions include:
 
 {{< table caption = "Node conditions, and a description of when each condition applies." >}}
-| Node Condition | Description |
+| Node Condition       | Description |
 |----------------------|-------------|
-| `Ready` | `True` if the node is healthy and ready to accept pods, `False` if the node is not healthy and is not accepting pods, and `Unknown` if the node controller has not heard from the node in the last `node-monitor-grace-period` (default is 50 seconds) |
-| `DiskPressure` | `True` if pressure exists on the disk size—that is, if the disk capacity is low; otherwise `False` |
-| `MemoryPressure` | `True` if pressure exists on the node memory—that is, if the node memory is low; otherwise `False` |
-| `PIDPressure` | `True` if pressure exists on the processes—that is, if there are too many processes on the node; otherwise `False` |
-| `NetworkUnavailable` | `True` if the network for the node is not correctly configured, otherwise `False` |
+| Ready              | True if the node is healthy and ready to accept pods, False if the node is not healthy and is not accepting pods, and Unknown if the node controller has not heard from the node in the last node-monitor-grace-period (default is 50 seconds) |
+| DiskPressure       | True if pressure exists on the disk size—that is, if the disk capacity is low; otherwise False |
+| MemoryPressure     | True if pressure exists on the node memory—that is, if the node memory is low; otherwise False |
+| PIDPressure        | True if pressure exists on the processes—that is, if there are too many processes on the node; otherwise False |
+| NetworkUnavailable | True if the network for the node is not correctly configured, otherwise False |
 {{< /table >}}
 
 {{< note >}}
 If you use command-line tools to print details of a cordoned Node, the Condition includes
-`SchedulingDisabled`. `SchedulingDisabled` is not a Condition in the Kubernetes API; instead,
+SchedulingDisabled. SchedulingDisabled is not a Condition in the Kubernetes API; instead,
 cordoned nodes are marked Unschedulable in their spec.
 {{< /note >}}
 
-In the Kubernetes API, a node's condition is represented as part of the `.status`
+In the Kubernetes API, a node's condition is represented as part of the .status
 of the Node resource. For example, the following JSON structure describes a healthy node:
 
-```json
+JSON
+
 "conditions": [
-{
-"type": "Ready",
-"status": "True",
-"reason": "KubeletReady",
-"message": "kubelet is posting ready status",
-"lastHeartbeatTime": "2019-06-05T18:38:35Z",
-"lastTransitionTime": "2019-06-05T11:41:27Z"
-}
+  {
+    "type": "Ready",
+    "status": "True",
+    "reason": "KubeletReady",
+    "message": "kubelet is posting ready status",
+    "lastHeartbeatTime": "2019-06-05T18:38:35Z",
+    "lastTransitionTime": "2019-06-05T11:41:27Z"
+  }
 ]
-```
-
 When problems occur on nodes, the Kubernetes control plane automatically creates
-[taints](/docs/concepts/scheduling-eviction/taint-and-toleration/) that match the conditions
-affecting the node. An example of this is when the `status` of the Ready condition
-remains `Unknown` or `False` for longer than the kube-controller-manager's `NodeMonitorGracePeriod`,
-which defaults to 50 seconds. This will cause either an `node.kubernetes.io/unreachable` taint, for an `Unknown` status,
-or a `node.kubernetes.io/not-ready` taint, for a `False` status, to be added to the Node.
+taints that match the conditions
+affecting the node. An example of this is when the status of the Ready condition
+remains Unknown or False for longer than the kube-controller-manager's NodeMonitorGracePeriod,
+which defaults to 50 seconds. This will cause either an node.kubernetes.io/unreachable taint, for an Unknown status,
+or a node.kubernetes.io/not-ready taint, for a False status, to be added to the Node.
 
 These taints affect pending pods as the scheduler takes the Node's taints into consideration when
 assigning a pod to a Node. Existing pods scheduled to the node may be evicted due to the application
-of `NoExecute` taints. Pods may also have {{< glossary_tooltip text="tolerations" term_id="toleration" >}} that let
+of NoExecute taints. Pods may also have {{< glossary_tooltip text="tolerations" term_id="toleration" >}} that let
 them schedule to and continue running on a Node even though it has a specific taint.
 
-See [Taint Based Evictions](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions) and
-[Taint Nodes by Condition](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-nodes-by-condition)
+See Taint Based Evictions and
+Taint Nodes by Condition
 for more details.
 
-## Capacity and Allocatable {#capacity}
-
+Capacity and Allocatable {#capacity}
 Describes the resources available on the node: CPU, memory, and the maximum
 number of pods that can be scheduled onto the node.
 
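The conditions and taints covered in this hunk can be checked on a live cluster with standard `kubectl` JSONPath queries; `node-1` below is a placeholder node name:

```shell
# Print each condition type and its status for one node
kubectl get node node-1 -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'

# Illustrative output:
# NetworkUnavailable=False
# MemoryPressure=False
# DiskPressure=False
# PIDPressure=False
# Ready=True

# Show any taints the control plane has added (for example node.kubernetes.io/not-ready)
kubectl get node node-1 -o jsonpath='{.spec.taints}'
```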
@@ -98,42 +92,42 @@ Node has. The allocatable block indicates the amount of resources on a
 Node that is available to be consumed by normal Pods.
 
 You may read more about capacity and allocatable resources while learning how
-to [reserve compute resources](/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable)
+to reserve compute resources
 on a Node.
 
-## Info
-
+Info
 Describes general information about the node, such as kernel version, Kubernetes
 version (kubelet and kube-proxy version), container runtime details, and which
 operating system the node uses.
 The kubelet gathers this information from the node and publishes it into
 the Kubernetes API.
 
-## Heartbeats
-
+Heartbeats
 Heartbeats, sent by Kubernetes nodes, help your cluster determine the
 availability of each node, and to take action when failures are detected.
 
 For nodes there are two forms of heartbeats:
 
-* updates to the `.status` of a Node
-* [Lease](/docs/concepts/architecture/leases/) objects
-  within the `kube-node-lease`
-  {{< glossary_tooltip term_id="namespace" text="namespace">}}.
-  Each Node has an associated Lease object.
+updates to the .status of a Node
 
-Compared to updates to `.status` of a Node, a Lease is a lightweight resource.
+Lease objects
+  within the kube-node-lease
+  {{< glossary_tooltip term_id="namespace" text="namespace">}}.
+  Each Node has an associated Lease object.
+
+Compared to updates to .status of a Node, a Lease is a lightweight resource.
 Using Leases for heartbeats reduces the performance impact of these updates
 for large clusters.
 
-The kubelet is responsible for creating and updating the `.status` of Nodes,
+The kubelet is responsible for creating and updating the .status of Nodes,
 and for updating their related Leases.
 
-- The kubelet updates the node's `.status` either when there is change in status
-  or if there has been no update for a configured interval. The default interval
-  for `.status` updates to Nodes is 5 minutes, which is much longer than the 40
-  second default timeout for unreachable nodes.
-- The kubelet creates and then updates its Lease object every 10 seconds
-  (the default update interval). Lease updates occur independently from
-  updates to the Node's `.status`. If the Lease update fails, the kubelet retries,
-  using exponential backoff that starts at 200 milliseconds and capped at 7 seconds.
+The kubelet updates the node's .status either when there is change in status
+  or if there has been no update for a configured interval. The default interval
+  for .status updates to Nodes is 5 minutes, which is much longer than the 40
+  second default timeout for unreachable nodes. The update interval is controlled by the nodeStatusUpdateFrequency field in the Kubelet configuration file, and the timeout is controlled by the --node-monitor-grace-period flag on the kube-controller-manager.
+
+The kubelet creates and then updates its Lease object every 10 seconds
+  (the default update interval). Lease updates occur independently from
+  updates to the Node's .status. If the Lease update fails, the kubelet retries,
+  using exponential backoff that starts at 200 milliseconds and capped at 7 seconds.
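The Lease-based heartbeat described at the end of this hunk can be observed directly; a minimal check, assuming a node named `node-1`:

```shell
# Each node has a Lease object in the kube-node-lease namespace; its holder is the node's kubelet
kubectl get leases -n kube-node-lease

# The time of the last heartbeat renewal for one node
kubectl get lease node-1 -n kube-node-lease -o jsonpath='{.spec.renewTime}'
```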
