@@ -175,7 +175,32 @@ Ceph
175
175
----
176
176
177
177
The following guide provides a good overview:
178
- https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/8/html/director_installation_and_usage/sect-rebooting-ceph
178
+ `<https://www.croit.io/blog/how-not-to-shut-down-a-ceph-cluster>`_.
179
+
180
+ #. Check that the cluster is healthy (e.g. with ``ceph -s``). Where possible, solve
181
+ or isolate any issues before the shutdown, e.g. by marking unhealthy OSDs as
182
+ 'out' in the cluster.
183
+
184
+ #. Stop all clients. This includes:
185
+
186
+ * **All** OpenStack VMs (if their storage is RBD-based).
187
+
188
+ * CephFS mounts.
189
+
190
+ * Ceph-backed OpenStack services such as Glance, Cinder, and RGW/S3/Swift.
191
+
192
+ #. Set the ``noout`` flag, so that the cluster does not attempt to redistribute
193
+ data when OSDs go down. Use the following command on a MON node:
194
+
195
+ .. code-block:: console
196
+
197
+ sudo cephadm shell -- ceph osd set noout
198
+
199
+ #. Shut down all the nodes, with those holding MON services last.
200
+
201
+ Note that if Ceph services should not start automatically when the operating
202
+ system next boots, extra steps are required; these are not described
203
+ here.
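
The health check and ``noout`` steps above could be combined into a small helper, sketched below. This is only an illustration, not part of the documented procedure: the ``CEPH`` variable and both function names are hypothetical, and refusing to set ``noout`` on an unhealthy cluster is a design choice that may be too strict if known issues have already been isolated.

```shell
#!/usr/bin/env bash
# Sketch of the pre-shutdown steps, intended to be run on a MON node.
# CEPH is a hypothetical variable holding the command prefix; its default
# matches the cephadm invocation used in this guide.
CEPH="${CEPH:-sudo cephadm shell -- ceph}"

# Succeeds only if the first word of "ceph health" is HEALTH_OK.
ceph_is_healthy() {
    [ "$($CEPH health 2>/dev/null | awk '{print $1}')" = "HEALTH_OK" ]
}

# Sets the noout flag, but refuses to do so while the cluster is unhealthy.
prepare_ceph_shutdown() {
    if ! ceph_is_healthy; then
        echo "Cluster is not healthy; inspect with: $CEPH health detail" >&2
        return 1
    fi
    $CEPH osd set noout
}
```

After sourcing the file, running ``prepare_ceph_shutdown`` once all clients have been stopped would cover steps 1 and 3; shutting down the nodes remains a manual step.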
179
204
180
205
Shutting down the seed VM
181
206
-------------------------
@@ -201,6 +226,23 @@ following order:
201
226
* Shut down seed VM
202
227
* Shut down Ansible control host
203
228
229
+ Full startup
230
+ -------------
231
+
232
+ If the entire control plane is powered down, it is best to bring the nodes up
233
+ in the reverse order of shutdown:
234
+
235
+ * Power on Ansible control host
236
+ * Power on seed VM (and other service VMs)
237
+ * Power on Ceph nodes (if applicable)
238
+ * Be sure to unset the ``noout`` flag, so that the cluster can
239
+ rebalance itself.
240
+ * Power on controllers
241
+ * Power on network nodes (if separate from controllers)
242
+ * Power on monitoring node (if separate from controllers)
243
+ * Power on compute nodes
244
+ * Power on virtual machines
245
+
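
Unsetting ``noout`` after the Ceph nodes are back up might look like the following sketch. It assumes the same cephadm-based invocation used for setting the flag; the ``CEPH`` variable and the function name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the post-startup Ceph step, intended to be run on a MON node.
# CEPH is a hypothetical variable holding the command prefix.
CEPH="${CEPH:-sudo cephadm shell -- ceph}"

# Clears the noout flag so the cluster can rebalance again, then prints
# the cluster status so the operator can confirm the flag is gone.
finish_ceph_startup() {
    $CEPH osd unset noout && $CEPH -s
}
```

It is probably worth confirming that all OSDs have rejoined the cluster before calling ``finish_ceph_startup``, since clearing the flag allows data redistribution to begin.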
204
246
Rebooting a node
205
247
----------------
206
248