
Commit 0600eae

Tim Bannister committed
Reword “Building large clusters”
The existing page had lots of references to specific products. Those references aren't in line with the current content guide, so I cut them out. I then reshaped the page to be a general set of advice about managing and running large clusters.
1 parent 0703805 commit 0600eae


content/en/docs/setup/best-practices/cluster-large.md

Lines changed: 71 additions & 90 deletions
@@ -2,126 +2,107 @@
 reviewers:
 - davidopp
 - lavalamp
-title: Building large clusters
+title: Considerations for large clusters
 weight: 20
 ---

-## Support
-
-At {{< param "version" >}}, Kubernetes supports clusters with up to 5000 nodes. More specifically, we support configurations that meet *all* of the following criteria:
+A cluster is a set of {{< glossary_tooltip text="nodes" term_id="node" >}} (physical
+or virtual machines) running Kubernetes agents, managed by the
+{{< glossary_tooltip text="control plane" term_id="control-plane" >}}.
+Kubernetes {{< param "version" >}} supports clusters with up to 5000 nodes. More specifically,
+Kubernetes is designed to accommodate configurations that meet *all* of the following criteria:

+* No more than 100 pods per node
 * No more than 5000 nodes
 * No more than 150000 total pods
 * No more than 300000 total containers
-* No more than 100 pods per node
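(These criteria interact: a cluster at the 5000-node limit cannot also run 100 pods on every node, since 5000 × 100 = 500000 would exceed the 150000 total-pod ceiling; at the maximum node count, the average works out to at most 150000 ÷ 5000 = 30 pods per node.)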
-
-
-## Setup
-
-A cluster is a set of nodes (physical or virtual machines) running Kubernetes agents, managed by a "master" (the cluster-level control plane).

-Normally the number of nodes in a cluster is controlled by the value `NUM_NODES` in the platform-specific `config-default.sh` file (for example, see [GCE's `config-default.sh`](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/gce/config-default.sh)).
+You can scale your cluster by adding or removing nodes. The way you do this depends
+on how your cluster is deployed.

-Simply changing that value to something very large, however, may cause the setup script to fail for many cloud providers. A GCE deployment, for example, will run in to quota issues and fail to bring the cluster up.
+## Cloud provider resource quotas {#quota-issues}

-When setting up a large Kubernetes cluster, the following issues must be considered.
-
-### Quota Issues
-
-To avoid running into cloud provider quota issues, when creating a cluster with many nodes, consider:
-
-* Increase the quota for things like CPU, IPs, etc.
-  * In [GCE, for example,](https://cloud.google.com/compute/docs/resource-quotas) you'll want to increase the quota for:
+To avoid running into cloud provider quota issues, when creating a cluster with many nodes,
+consider:
+* Request a quota increase for cloud resources such as:
+  * Compute instances
   * CPUs
-  * VM instances
-  * Total persistent disk reserved
+  * Storage volumes
   * In-use IP addresses
-  * Firewall Rules
-  * Forwarding rules
-  * Routes
-  * Target pools
-* Gating the setup script so that it brings up new node VMs in smaller batches with waits in between, because some cloud providers rate limit the creation of VMs.
-
-### Etcd storage
-
-To improve performance of large clusters, we store events in a separate dedicated etcd instance.
+  * Packet filtering rule sets
+  * Number of load balancers
+  * Network subnets
+  * Log streams
+* Gate the cluster scaling actions to bring up new nodes in batches, with a pause
+  between batches, because some cloud providers rate limit the creation of new instances.

-When creating a cluster, existing salt scripts:
+## Control plane components

-* start and configure additional etcd instance
-* configure api-server to use it for storing events
-
-### Size of master and master components
-
-On GCE/Google Kubernetes Engine, and AWS, `kube-up` automatically configures the proper VM size for your master depending on the number of nodes
-in your cluster. On other providers, you will need to configure it manually. For reference, the sizes we use on GCE are
+For a large cluster, you need a control plane with sufficient compute and other
+resources.

-* 1-5 nodes: n1-standard-1
-* 6-10 nodes: n1-standard-2
-* 11-100 nodes: n1-standard-4
-* 101-250 nodes: n1-standard-8
-* 251-500 nodes: n1-standard-16
-* more than 500 nodes: n1-standard-32
+Typically you would run one or two control plane instances per failure zone,
+scaling those instances vertically first and then scaling horizontally after reaching
+the point of diminishing returns for vertical scaling.

-And the sizes we use on AWS are
+### etcd storage

-* 1-5 nodes: m3.medium
-* 6-10 nodes: m3.large
-* 11-100 nodes: m3.xlarge
-* 101-250 nodes: m3.2xlarge
-* 251-500 nodes: c4.4xlarge
-* more than 500 nodes: c4.8xlarge
+To improve performance of large clusters, you can store Event objects in a separate
+dedicated etcd instance.

-{{< note >}}
-On Google Kubernetes Engine, the size of the master node adjusts automatically based on the size of your cluster. For more information, see [this blog post](https://cloudplatform.googleblog.com/2017/11/Cutting-Cluster-Management-Fees-on-Google-Kubernetes-Engine.html).
+When creating a cluster, you can (using custom tooling):

-On AWS, master node sizes are currently set at cluster startup time and do not change, even if you later scale your cluster up or down by manually removing or adding nodes or using a cluster autoscaler.
-{{< /note >}}
+* start and configure an additional etcd instance
+* configure the {{< glossary_tooltip term_id="kube-apiserver" text="API server" >}} to use it for storing events
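As a sketch of how that second step can look: the kube-apiserver flag `--etcd-servers-overrides` routes storage for a single resource type to a dedicated etcd cluster. In this hypothetical static Pod fragment, the `etcd-main` and `etcd-events` endpoints and the image tag are placeholder values, not part of this page:

```yaml
# Hypothetical kube-apiserver static Pod fragment. The etcd endpoint names
# (etcd-main, etcd-events) and the image tag are illustrative placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    image: k8s.gcr.io/kube-apiserver:v1.19.0   # use your cluster's version
    command:
    - kube-apiserver
    # Main etcd cluster, used for all other resources
    - --etcd-servers=https://etcd-main:2379
    # Store Event objects (core API group) in a dedicated etcd instance
    - --etcd-servers-overrides=/events#https://etcd-events:2379
```

The override format is `group/resource#servers`; the leading `/` selects the core API group, so `/events` matches Event objects.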

-### Addon Resources
+## Addon resources

-To prevent memory leaks or other resource issues in [cluster addons](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/addons) from consuming all the resources available on a node, Kubernetes sets resource limits on addon containers to limit the CPU and Memory resources they can consume (See PR [#10653](https://pr.k8s.io/10653/files) and [#10778](https://pr.k8s.io/10778/files)).
+To prevent memory leaks or other resource issues in cluster
+{{< glossary_tooltip text="addons" term_id="addons" >}} from consuming all the resources
+available on a node, Kubernetes sets
+[resource limits](/docs/concepts/configuration/manage-resources-containers/) on addon
+Pods to limit the amount of CPU and memory that they can consume.

 For example:

 ```yaml
 containers:
 - name: fluentd-cloud-logging
-  image: k8s.gcr.io/fluentd-gcp:1.16
+  image: fluent/fluentd-kubernetes-daemonset:v1
   resources:
     limits:
       cpu: 100m
       memory: 200Mi
 ```

-Except for Heapster, these limits are static and are based on data we collected from addons running on 4-node clusters (see [#10335](https://issue.k8s.io/10335#issuecomment-117861225)). The addons consume a lot more resources when running on large deployment clusters (see [#5880](http://issue.k8s.io/5880#issuecomment-113984085)). So, if a large cluster is deployed without adjusting these values, the addons may continuously get killed because they keep hitting the limits.
-
-To avoid running into cluster addon resource issues, when creating a cluster with many nodes, consider the following:
-
-* Scale memory and CPU limits for each of the following addons, if used, as you scale up the size of cluster (there is one replica of each handling the entire cluster so memory and CPU usage tends to grow proportionally with size/load on cluster):
-  * [InfluxDB and Grafana](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/addons/cluster-monitoring/influxdb/influxdb-grafana-controller.yaml)
-  * [kubedns, dnsmasq, and sidecar](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/addons/dns/kube-dns/kube-dns.yaml.in)
-  * [Kibana](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/addons/fluentd-elasticsearch/kibana-deployment.yaml)
-* Scale number of replicas for the following addons, if used, along with the size of cluster (there are multiple replicas of each so increasing replicas should help handle increased load, but, since load per replica also increases slightly, also consider increasing CPU/memory limits):
-  * [elasticsearch](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/addons/fluentd-elasticsearch/es-statefulset.yaml)
-* Increase memory and CPU limits slightly for each of the following addons, if used, along with the size of cluster (there is one replica per node but CPU/memory usage increases slightly along with cluster load/size as well):
-  * [FluentD with ElasticSearch Plugin](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/addons/fluentd-elasticsearch/fluentd-es-ds.yaml)
-  * [FluentD with GCP Plugin](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/addons/fluentd-gcp/fluentd-gcp-ds.yaml)
-
-Heapster's resource limits are set dynamically based on the initial size of your cluster (see [#16185](http://issue.k8s.io/16185)
-and [#22940](http://issue.k8s.io/22940)). If you find that Heapster is running
-out of resources, you should adjust the formulas that compute heapster memory request (see those PRs for details).
-
-For directions on how to detect if addon containers are hitting resource limits, see the
-[Troubleshooting section of Compute Resources](/docs/concepts/configuration/manage-resources-containers/#troubleshooting).
-
-### Allowing minor node failure at startup
-
-For various reasons (see [#18969](https://github.com/kubernetes/kubernetes/issues/18969) for more details) running
-`kube-up.sh` with a very large `NUM_NODES` may fail due to a very small number of nodes not coming up properly.
-Currently you have two choices: restart the cluster (`kube-down.sh` and then `kube-up.sh` again), or before
-running `kube-up.sh` set the environment variable `ALLOWED_NOTREADY_NODES` to whatever value you feel comfortable
-with. This will allow `kube-up.sh` to succeed with fewer than `NUM_NODES` coming up. Depending on the
-reason for the failure, those additional nodes may join later or the cluster may remain at a size of
-`NUM_NODES - ALLOWED_NOTREADY_NODES`.
-
+These limits are static and are based on data collected from addons running on
+small clusters. Most addons consume a lot more resources when running on large
+clusters. So, if a large cluster is deployed without adjusting these values, the
+addon(s) may continuously get killed because they keep hitting the limits.
+
+To avoid running into cluster addon resource issues, when creating a cluster with
+many nodes, consider the following:
+
+* Some addons scale vertically - there is one replica of the addon for the cluster
+  or serving a whole failure zone. For these addons, increase requests and limits
+  as you scale out your cluster.
+* Many addons scale horizontally - you add capacity by running more pods - but with
+  a very large cluster you may also need to raise CPU or memory limits slightly.
+  The VerticalPodAutoscaler can run in _recommender_ mode to provide suggested
+  figures for requests and limits (see the sketch after this list).
+* Some addons run as one copy per node, controlled by a {{< glossary_tooltip text="DaemonSet"
+  term_id="daemonset" >}}: for example, a node-level log aggregator. Similar to
+  the case with horizontally-scaled addons, you may also need to raise CPU or memory
+  limits slightly.
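As a concrete sketch of that recommender mode: the following manifest assumes the VerticalPodAutoscaler custom resource definitions are installed, and targets the hypothetical fluentd DaemonSet from the earlier example. With `updateMode: "Off"`, the VPA only publishes recommendations; it never evicts or mutates Pods.

```yaml
# Minimal sketch: a VerticalPodAutoscaler in recommender-only mode.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: fluentd-recommender      # hypothetical name
  namespace: kube-system
spec:
  targetRef:
    apiVersion: apps/v1
    kind: DaemonSet
    name: fluentd-cloud-logging  # assumed addon from the example above
  updatePolicy:
    updateMode: "Off"            # recommend only; do not apply changes
```

You can then read the suggested figures from the object's status (for example, with `kubectl describe vpa fluentd-recommender -n kube-system`) and copy them into the addon's manifest.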
+
+## {{% heading "whatsnext" %}}
+
+`VerticalPodAutoscaler` is a custom resource that you can deploy into your cluster
+to help you manage resource requests and limits for pods.
+Visit [Vertical Pod Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler#readme)
+to learn more about `VerticalPodAutoscaler` and how you can use it to scale cluster
+components, including cluster-critical addons.
+
+The [cluster autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler#readme)
+integrates with a number of cloud providers to help you run the right number of
+nodes for the level of resource demand in your cluster.
