
Commit 1f11d2f

docs: add info on adding and removing hosts
Forward-ported from Train commit I19c7f05b538a7abc9253194bf041c037b1998378.

Change-Id: If07b84e0bbdcb7da8dbef87cc8826987f1d11cf8
(cherry picked from commit 03b8117)
1 parent 620bb29 commit 1f11d2f

3 files changed, +226 -0 lines changed

doc/source/conf.py

Lines changed: 1 addition & 0 deletions
@@ -102,4 +102,5 @@
 'oslo.messaging',
 'oslotest',
 'swift',
+'watcher',
 ]
Lines changed: 224 additions & 0 deletions
@@ -0,0 +1,224 @@
=========================
Adding and removing hosts
=========================

This page discusses how to add and remove nodes from an existing cluster. The
procedure differs depending on the type of nodes being added or removed, which
services are running, and how they are configured. Here we consider two types
of nodes: controllers and compute nodes. Other types of nodes will need
separate consideration.

Any procedure being used should be tested before being applied in a production
environment.

Adding new hosts
================

.. _adding-new-controllers:

Adding new controllers
----------------------

The :doc:`bootstrap-servers command
</reference/deployment-and-bootstrapping/bootstrap-servers>` can be used to
prepare the new hosts that are being added to the system. It adds an entry to
``/etc/hosts`` for the new hosts, and some services, such as RabbitMQ, require
entries to exist for all controllers on every controller. If using a
``--limit`` argument, ensure that all controllers are included, e.g. via
``--limit control``. Be aware of the :ref:`potential issues <rebootstrapping>`
with running ``bootstrap-servers`` on an existing system.

.. code-block:: console

   kolla-ansible -i <inventory> bootstrap-servers [ --limit <limit> ]

Pull down container images to the new hosts. The ``--limit`` argument may be
used and only needs to include the new hosts.

.. code-block:: console

   kolla-ansible -i <inventory> pull [ --limit <limit> ]

Deploy containers to the new hosts. If using a ``--limit`` argument, ensure
that all controllers are included, e.g. via ``--limit control``.

.. code-block:: console

   kolla-ansible -i <inventory> deploy [ --limit <limit> ]

The new controllers are now deployed. It is recommended to perform testing
of the control plane at this point to verify that the new controllers are
functioning correctly.
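
As one possible smoke test, and only a minimal sketch, the state that Nova
reports for each compute service can be checked. The helper name
``list_down_services`` is hypothetical, and the check assumes that the
``State`` column is the last field of each output line and reads ``up`` or
``down``.

.. code-block:: console

   # Hypothetical helper: print any compute service that Nova reports as down.
   list_down_services() {
     openstack compute service list -f value -c Binary -c Host -c State |
       awk '$NF == "down"'
   }

If ``list_down_services`` prints nothing after the deploy, every compute
service is reporting in, including those on the new controllers. This does not
replace functional testing of the control plane.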

Some resources may not be automatically balanced onto the new controllers. It
may be helpful to manually rebalance these resources onto the new controllers.
Examples include networks hosted by Neutron DHCP agent, and routers hosted by
Neutron L3 agent. The `removing-existing-controllers`_ section provides an
example of how to do this.

.. _adding-new-compute-nodes:

Adding new compute nodes
------------------------

The :doc:`bootstrap-servers command
</reference/deployment-and-bootstrapping/bootstrap-servers>` can be used to
prepare the new hosts that are being added to the system. Be aware of the
:ref:`potential issues <rebootstrapping>` with running ``bootstrap-servers`` on
an existing system.

.. code-block:: console

   kolla-ansible -i <inventory> bootstrap-servers [ --limit <limit> ]

Pull down container images to the new hosts. The ``--limit`` argument may be
used and only needs to include the new hosts.

.. code-block:: console

   kolla-ansible -i <inventory> pull [ --limit <limit> ]

Deploy containers on the new hosts. The ``--limit`` argument may be used and
only needs to include the new hosts.

.. code-block:: console

   kolla-ansible -i <inventory> deploy [ --limit <limit> ]

The new compute nodes are now deployed. It is recommended to perform
testing of the compute nodes at this point to verify that they are functioning
correctly.

Server instances are not automatically balanced onto the new compute nodes. It
may be helpful to live migrate some server instances onto the new hosts.

.. code-block:: console

   openstack server migrate <server> --live-migration --host <target host> --os-compute-api-version 2.30

Alternatively, a service such as :watcher-doc:`Watcher </>` may be used to do
this automatically.

Removing existing hosts
=======================

.. _removing-existing-controllers:

Removing existing controllers
-----------------------------

When removing controllers or other hosts running clustered services, consider
whether enough hosts remain in the cluster to form a quorum. For example, in a
system with 3 controllers, only one should be removed at a time. Consider also
the effect this will have on redundancy.
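
The arithmetic behind this can be sketched as follows, assuming a simple
majority quorum in which more than half of the members must remain. The
function names are illustrative only, and individual clustered services may
apply different rules.

.. code-block:: console

   # Members needed for a majority in a cluster of $1 nodes.
   quorum_size() {
     echo $(( $1 / 2 + 1 ))
   }

   # Nodes that can be absent at once while a majority remains.
   max_removable() {
     echo $(( $1 - $(quorum_size "$1") ))
   }

   max_removable 3   # prints 1
   max_removable 5   # prints 2

Under the same assumption, ``max_removable 2`` prints 0: once a 3 controller
system has been reduced to 2, no further member can be lost without losing the
majority.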

Before removing existing controllers from a cluster, it is recommended to move
resources they are hosting. Here we will cover networks hosted by Neutron DHCP
agent and routers hosted by Neutron L3 agent. Other actions may be necessary,
depending on your environment and configuration.

For each host being removed, find the Neutron routers on that host and move
them to the L3 agent on another host, then disable the L3 agent on the host
being removed. For example:

.. code-block:: console

   # IDs of the L3 agents on the host being removed and on the target host.
   l3_id=$(openstack network agent list --host <host> --agent-type l3 -f value -c ID)
   target_l3_id=$(openstack network agent list --host <target host> --agent-type l3 -f value -c ID)
   # Move each router from the old agent to the target agent.
   openstack router list --agent "$l3_id" -f value -c ID | while read -r router; do
     openstack network agent remove router "$l3_id" "$router" --l3
     openstack network agent add router "$target_l3_id" "$router" --l3
   done
   openstack network agent set "$l3_id" --disable

Repeat for the networks hosted by the DHCP agent:

.. code-block:: console

   # IDs of the DHCP agents on the host being removed and on the target host.
   dhcp_id=$(openstack network agent list --host <host> --agent-type dhcp -f value -c ID)
   target_dhcp_id=$(openstack network agent list --host <target host> --agent-type dhcp -f value -c ID)
   # Move each network from the old agent to the target agent.
   openstack network list --agent "$dhcp_id" -f value -c ID | while read -r network; do
     openstack network agent remove network "$dhcp_id" "$network" --dhcp
     openstack network agent add network "$target_dhcp_id" "$network" --dhcp
   done

Stop all services running on the hosts being removed:

.. code-block:: console

   kolla-ansible -i <inventory> stop --yes-i-really-really-mean-it [ --limit <limit> ]

Remove the hosts from the Ansible inventory.

Reconfigure the remaining controllers to update the membership of clusters such
as MariaDB and RabbitMQ. Use a suitable limit, such as ``--limit control``.

.. code-block:: console

   kolla-ansible -i <inventory> deploy [ --limit <limit> ]

Perform testing to verify that the remaining cluster hosts are operating
correctly.

For each host, clean up its services:

.. code-block:: console

   openstack network agent list --host <host> -f value -c ID | while read id; do
     openstack network agent delete $id
   done

   openstack compute service list --os-compute-api-version 2.53 --host <host> -f value -c ID | while read id; do
     openstack compute service delete --os-compute-api-version 2.53 $id
   done

.. _removing-existing-compute-nodes:

Removing existing compute nodes
-------------------------------

When removing compute nodes from a system, consider whether there is capacity
to host the running workload on the remaining compute nodes. Include overhead
for failures that may occur.
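
A rough way to reason about this, shown only as a sketch, is to compare the
resources in use with the capacity left after the removal and after a number
of further host failures. The function name, the assumption of identically
sized hosts, and the use of vCPUs as the only resource (with no overcommit) are
all simplifications made for illustration.

.. code-block:: console

   # Usage: fits_after_removal <hosts> <removed> <spare> <vcpus_per_host> <vcpus_used>
   # Succeeds if the workload still fits with <removed> hosts gone and
   # <spare> further hosts held back to absorb failures.
   fits_after_removal() {
     remaining=$(( $1 - $2 - $3 ))
     [ "$remaining" -gt 0 ] && [ $(( remaining * $4 )) -ge "$5" ]
   }

For example, ``fits_after_removal 10 2 1 32 200`` succeeds, because the 7 hosts
that are counted provide 224 vCPUs. Memory and disk would need the same check.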

Before removing compute nodes from a system, it is recommended to migrate or
destroy any instances that they are hosting.

For each host, disable the compute service to ensure that no new instances are
scheduled to it.

.. code-block:: console

   openstack compute service set <host> nova-compute --disable

If possible, live migrate instances to another host.

.. code-block:: console

   # List instances from all projects, not only the current one.
   openstack server list --all-projects --host <host> -f value -c ID | while read -r server; do
     openstack server migrate --live-migration "$server"
   done

Verify that the migrations were successful.
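
One way to check this, sketched here with a hypothetical helper named
``host_is_empty``, is to confirm that no instances are still listed against the
host. The sketch assumes admin credentials, so that ``--all-projects`` and
``--host`` are permitted.

.. code-block:: console

   # Hypothetical helper: succeed only if no instances remain on the host.
   host_is_empty() {
     remaining=$(openstack server list --all-projects --host "$1" -f value -c ID | grep -c .)
     [ "$remaining" -eq 0 ]
   }

If ``host_is_empty`` fails, some instances are still on the host, for example
because a migration failed or has not yet finished. Check them individually
before stopping services.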

Stop all services running on the hosts being removed:

.. code-block:: console

   kolla-ansible -i <inventory> stop --yes-i-really-really-mean-it [ --limit <limit> ]

Remove the hosts from the Ansible inventory.

Perform testing to verify that the remaining cluster hosts are operating
correctly.

For each host, clean up its services:

.. code-block:: console

   openstack network agent list --host <host> -f value -c ID | while read id; do
     openstack network agent delete $id
   done

   openstack compute service list --os-compute-api-version 2.53 --host <host> -f value -c ID | while read id; do
     openstack compute service delete --os-compute-api-version 2.53 $id
   done

doc/source/user/index.rst

Lines changed: 1 addition & 0 deletions
@@ -11,5 +11,6 @@ User Guides
 multinode
 multi-regions
 operating-kolla
+adding-and-removing-hosts
 security
 troubleshooting
