
Commit 8102e79

Merge pull request kubernetes#3707 from wangshiqi308/pr_update_readme

update readme file

2 parents dfc5619 + ba490bc

File tree

1 file changed (+158 −47 lines)
  • cluster-autoscaler/cloudprovider/huaweicloud


cluster-autoscaler/cloudprovider/huaweicloud/README.md

Lines changed: 158 additions & 47 deletions
@@ -1,20 +1,11 @@
 # Cluster Autoscaler on Huawei Cloud
 
 ## Overview
-The cluster autoscaler for [Huawei ServiceStage](https://www.huaweicloud.com/intl/en-us/product/servicestage.html) scales worker nodes within any
-specified container cluster's node pool where the `Autoscaler` label is on.
+The cluster autoscaler works with self-built Kubernetes clusters on [Huaweicloud ECS](https://www.huaweicloud.com/intl/en-us/product/ecs.html) and
+specified [Huaweicloud Auto Scaling Groups](https://www.huaweicloud.com/intl/en-us/product/as.html).
 It runs as a Deployment on a worker node in the cluster. This README will go over some of the necessary steps required
 to get the cluster autoscaler up and running.
 
-Note:
-
-1. Cluster autoscaler must be run on a version of Huawei container engine 1.15.6 or later.
-2. Node pool attached to the cluster must have the `Autoscaler` flag turned on, and minimum number of nodes and maximum
-number of nodes being set. Node pools can be managed by `Resource Management` from Huawei container engine console.
-3. If warnings about installing `autoscaler addon` are encountered after creating a node pool with `Autoscaler` flag on,
-just ignore this warning and DO NOT install the addon.
-4. Do not build your image in a Huawei Cloud ECS. Build the image in a machine that has access to the Google Container Registry (GCR).
-
 ## Deployment Steps
 ### Build Image
 #### Environment
@@ -76,7 +67,130 @@ The following steps use Huawei SoftWare Repository for Container (SWR) as an example
 
 5. For the cluster autoscaler to function normally, make sure the `Sharing Type` of the image is `Public`.
 If the cluster has trouble pulling the image, go to SWR console and check whether the `Sharing Type` of the image is
-`Private`. If it is, click `Edit` button on top right and set the `Sharing Type` to `Public`.
+`Private`. If it is, click the `Edit` button on the top right and set the `Sharing Type` to `Public`.
+
+
+## Build Kubernetes Cluster on ECS
+
+### 1. Install kubelet, kubeadm and kubectl
+
+Please see the installation instructions [here](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/)
+
+For example:
+- OS: CentOS 8
+- Note: The following example should be run on an ECS that has access to the Google Container Registry (GCR)
+```bash
+cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
+[kubernetes]
+name=Kubernetes
+baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
+enabled=1
+gpgcheck=1
+repo_gpgcheck=1
+gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
+exclude=kubelet kubeadm kubectl
+EOF
+```
+
+```bash
+sudo setenforce 0
+sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
+
+sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
+
+sudo systemctl enable --now kubelet
+```
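A quick way to confirm the toolchain is in place before moving on (an optional sanity check; kubelet will not be fully running until the cluster is initialized):

```bash
# Confirm kubeadm and kubectl are installed and kubelet is registered as a service
kubeadm version
kubectl version --client
systemctl status kubelet --no-pager
```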
+### 2. Install Docker
+Please see the installation instructions [here](https://docs.docker.com/engine/install/)
+
+For example:
+- OS: CentOS 8
+- Note: The following example should be run on an ECS that has access to the Google Container Registry (GCR)
+
+```bash
+sudo yum install -y yum-utils
+
+sudo yum-config-manager \
+    --add-repo \
+    https://download.docker.com/linux/centos/docker-ce.repo
+
+sudo yum install docker-ce docker-ce-cli containerd.io
+
+sudo systemctl start docker
+```
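To verify the Docker engine is working, the standard smoke test from the Docker docs can be run:

```bash
# Pulls and runs a tiny test image; prints a confirmation message on success
sudo docker run hello-world
```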
+
+### 3. Initialize Cluster
+```bash
+# Flannel (installed in step 4) expects this pod network CIDR by default
+kubeadm init --pod-network-cidr=10.244.0.0/16
+
+mkdir -p $HOME/.kube
+sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
+sudo chown $(id -u):$(id -g) $HOME/.kube/config
+```
+
+### 4. Install Flannel Network
+```bash
+kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
+```
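Once Flannel is applied, the control-plane node should eventually report `Ready` and the system pods should be `Running` (a quick check):

```bash
# The node becomes Ready once the CNI plugin is up; coredns pods should be Running
kubectl get nodes
kubectl get pods -n kube-system
```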
+### 5. Generate Token
+```bash
+kubeadm token create --ttl 0
+```
+This generates a token that never expires. Note down the token; it will be used later.
+
+Get the CA certificate hash and note it down as well; it will also be used later.
+```bash
+openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -pubkey | openssl rsa -pubin -outform DER 2>/dev/null | sha256sum | cut -d' ' -f1
+```
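These two values plug into `kubeadm join` on worker nodes. With the master's API server endpoint (typically `<master-ip>:6443`), the join command has the following shape; note the `sha256:` prefix kubeadm expects on the hash:

```bash
# $TOKEN and $HASHKEY are the values generated above; $API_Server_EndPoint is <master-ip>:6443
kubeadm join --token $TOKEN $API_Server_EndPoint --discovery-token-ca-cert-hash sha256:$HASHKEY
```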
+
+### 6. Create OS Image with K8S Tools
+- Launch a new ECS instance, and install kubeadm, kubectl and Docker on it.
+- Create a script to join the new instance into the k8s cluster.
+```bash
+cat <<EOF >/etc/rc.d/init.d/init-k8s.sh
+#!/bin/bash
+#chkconfig: 2345 80 90
+setenforce 0
+swapoff -a
+
+yum install -y kubelet
+systemctl start docker
+
+kubeadm join --token $TOKEN $API_Server_EndPoint --discovery-token-ca-cert-hash sha256:$HASHKEY
+EOF
+```
+- Add this script to chkconfig so that it runs automatically after the instance starts.
+```bash
+chmod +x /etc/rc.d/init.d/init-k8s.sh
+chkconfig --add /etc/rc.d/init.d/init-k8s.sh
+chkconfig /etc/rc.d/init.d/init-k8s.sh on
+```
+- Copy `~/.kube/config` from the master node to `~/.kube/config` on this ECS to set up kubectl on the instance (see the sketch after this list).
+
+- Go to Huawei Cloud `Image Management` Service and click on `Create Image`. Select type `System disk image`, select your ECS instance as `Source`, then give it a name and create it.
+
+- Remember this ECS instance ID since it will be used later.
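For the kubeconfig copy step above, one possible sketch, run on the new ECS (`$MASTER_IP` is a placeholder for the master node's address, and root SSH access is assumed):

```bash
# Copy the admin kubeconfig from the master node so kubectl works on this instance
mkdir -p ~/.kube
scp root@$MASTER_IP:~/.kube/config ~/.kube/config
```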
+
+### 7. Create AS Group
+- Follow the Huawei Cloud instructions to create an AS Group.
+- Create an AS Configuration, and select the private image we just created.
+- While creating the `AS Configuration`, add the following script into `Advanced Settings`.
+```bash
+#!/bin/bash
+
+# Patch this node's providerID with the ECS instance ID assigned by the AS Group.
+# Replace $ECS_INSTANCE_ID with the image-source instance ID remembered in step 6;
+# cloud-init keeps one directory per instance ID that has booted this image.
+while true
+do
+  IDS=$(ls /var/lib/cloud/instances/)
+  for ID in $IDS
+  do
+    if [ $ID != $ECS_INSTANCE_ID ]; then
+      /usr/bin/kubectl --kubeconfig ~/.kube/config patch node $HOSTNAME -p "{\"spec\":{\"providerID\":\"$ID\"}}"
+    fi
+  done
+  sleep 30
+done
+```
+- Bind the AS Group with this AS Configuration.
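To check that the patch took effect on a newly joined node, the providerID column can be listed directly:

```bash
# List nodes with the providerID set by the script above
kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDER_ID:.spec.providerID
```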
 
 ### Deploy Cluster Autoscaler
 #### Configure credentials
@@ -99,6 +213,24 @@ The following parameters are required in the Secret object yaml file:
 identity-endpoint=https://iam.cn-north-4.myhuaweicloud.com/v3.0
 ```
 
+- `as-endpoint`
+
+Find the AS endpoint for different regions [here](https://developer.huaweicloud.com/endpoint?AS).
+
+For example, for region `cn-north-4`, the endpoint is
+```
+as.cn-north-4.myhuaweicloud.com
+```
+
+- `ecs-endpoint`
+
+Find the ECS endpoint for different regions [here](https://developer.huaweicloud.com/endpoint?ECS).
+
+For example, for region `cn-north-4`, the endpoint is
+```
+ecs.cn-north-4.myhuaweicloud.com
+```
+
 - `project-id`
 
 Follow this link to find the project-id: [Obtaining a Project ID](https://support.huaweicloud.com/en-us/api-servicestage/servicestage_api_0023.html)
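Putting the parameters above together, a hedged sketch of creating the Secret from literals; the Secret name `huaweicloud-auth-config`, the namespace, and the key names are illustrative assumptions, so follow the README's actual Secret yaml for the authoritative format:

```bash
# Illustrative only: the Secret name, namespace, and keys are assumptions
kubectl create secret generic huaweicloud-auth-config -n kube-system \
  --from-literal=identity-endpoint=https://iam.cn-north-4.myhuaweicloud.com/v3.0 \
  --from-literal=as-endpoint=as.cn-north-4.myhuaweicloud.com \
  --from-literal=ecs-endpoint=ecs.cn-north-4.myhuaweicloud.com \
  --from-literal=project-id=<your-project-id>
```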
@@ -127,19 +259,15 @@ and [My Credentials](https://support.huaweicloud.com/en-us/usermanual-ca/ca_01_0
 ```
 {Minimum number of nodes}:{Maximum number of nodes}:{Node pool name}
 ```
-The above parameters should match the parameters of the node pool you created. Currently, Huawei ServiceStage only provides
-autoscaling against a single node pool.
+The above parameters should match the parameters of the AS Group you created.
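Concretely, for an AS Group that should hold between 1 and 10 nodes, the value passed via the autoscaler's `--nodes` flag might look like this (the group name `k8s-worker-asg` is a made-up example):

```bash
# Example autoscaler startup flag: min 1, max 10, AS Group named k8s-worker-asg
--nodes=1:10:k8s-worker-asg
```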
 
 More configuration options can be added to the cluster autoscaler, such as `scale-down-delay-after-add`, `scale-down-unneeded-time`, etc.
 See available configuration options [here](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca).
 
 #### Deploy cluster autoscaler on the cluster
 1. Log in to a machine which can manage the cluster with `kubectl`.
 
-Make sure the machine has kubectl access to the cluster. We recommend using a worker node to manage the cluster. Follow
-the instructions for
-[Connecting to a Kubernetes Cluster Using kubectl](https://support.huaweicloud.com/intl/en-us/usermanual-cce/cce_01_0107.html)
-to set up kubectl access to the cluster if you cannot execute `kubectl` on your machine.
+Make sure the machine has kubectl access to the cluster.
 
 2. Create the Service Account:
 ```
@@ -163,25 +291,20 @@ kubectl get pods -n kube-system
 ```
 
 To see whether it functions correctly, deploy a Service to the cluster, and increase and decrease workload to the
-Service. Cluster autoscaler should be able to autoscale the node pool with `Autoscaler` on to accommodate the load.
+Service. Cluster autoscaler should be able to autoscale the AS Group to accommodate the load.
 
 A simple testing method is like this:
 - Create a Service: listening for http request
 
-- Create HPA or AOM policy for pods to be autoscaled
-    * AOM policy: To create an AOM policy, go into the deployment, click `Scaling` tag and click `Add Scaling Policy`
-      button on Huawei Cloud UI.
-    * HPA policy: There're two ways to create an HPA policy.
-        * Follow this instruction to create an HPA policy through UI:
-          [Scaling a Workload](https://support.huaweicloud.com/intl/en-us/usermanual-cce/cce_01_0208.html)
-        * Install [metrics server](https://github.com/kubernetes-sigs/metrics-server) by yourself and create an HPA policy
-          by executing something like this:
-          ```
-          kubectl autoscale deployment [Deployment name] --cpu-percent=50 --min=1 --max=10
-          ```
-          The above command creates an HPA policy on the deployment with target average cpu usage of 50%. The number of
-          pods will grow if average cpu usage is above 50%, and will shrink otherwise. The `min` and `max` parameters set
-          the minimum and maximum number of pods of this deployment.
+- Create an HPA policy for pods to be autoscaled
+    * Install [metrics server](https://github.com/kubernetes-sigs/metrics-server) by yourself and create an HPA policy
+      by executing something like this:
+      ```
+      kubectl autoscale deployment [Deployment name] --cpu-percent=50 --min=1 --max=10
+      ```
+      The above command creates an HPA policy on the deployment with a target average cpu usage of 50%. The number of
+      pods will grow if average cpu usage is above 50%, and will shrink otherwise. The `min` and `max` parameters set
+      the minimum and maximum number of pods of this deployment.
 - Generate load to the above service
 
 Example tools for generating workload to an http service are:
@@ -196,30 +319,18 @@ A simple testing method is like this:
 
 Feel free to use other tools which have a similar function.
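As one possibility, load can also be generated with a plain shell loop (a minimal sketch; `$SERVICE_URL` is a placeholder for the Service's address):

```bash
# Send a continuous stream of requests from 20 background loops; stop with Ctrl-C
for i in $(seq 1 20); do
  ( while true; do curl -s -o /dev/null "$SERVICE_URL"; done ) &
done
wait
```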
 
-- Wait for pods to be added: as load increases, more pods will be added by HPA or AOM
+- Wait for pods to be added: as load increases, more pods will be added by HPA
 
 - Wait for nodes to be added: when there's insufficient resource for additional pods, new nodes will be added to the
 cluster by the cluster autoscaler
 
 - Stop the load
 
-- Wait for pods to be removed: as load decreases, pods will be removed by HPA or AOM
+- Wait for pods to be removed: as load decreases, pods will be removed by HPA
 
 - Wait for nodes to be removed: as pods are removed from nodes, several nodes will become underutilized or empty,
 and will be removed by the cluster autoscaler
 
-
-## Notes
-
-1. Huawei ServiceStage does not yet support autoscaling against multiple node pools within a single cluster, but
-this is currently under development. For now, make sure that there's only one node pool with `Autoscaler` label
-on in the cluster.
-2. If the version of the cluster is v1.15.6 or older, log statements similar to the following may be present in the
-autoscaler pod logs:
-```
-E0402 13:25:05.472999 1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.CSINode: the server could not find the requested resource
-```
-This is normal and will be fixed by a future version of cluster.
 
 ## Support & Contact Info
