@@ -105,13 +105,14 @@ Do the following when restoring your file system:
105105 ``refuse_client_session`` file-system setting to prevent new sessions from
106106 connecting to the CephFS.
107107
108- * **Extend the MDS heartbeat grace period.** This avoids replacing an MDS that
109- appears "stuck" during some operation. Sometimes recovery of an MDS may
110- involve an operation that takes longer than expected (from the programmer's
111- perspective). This is more likely when recovery is already taking longer than
112- normal to complete (indicated by your reading this document). Avoid
113- unnecessary replacement loops by running the following command and extending
114- the heartbeat grace period:
108+ * **Extend the MDS heartbeat grace period.** Doing this causes the system to
109+ avoid replacing an MDS that becomes "stuck" during an operation. Sometimes
110+ recovery of an MDS may involve operations that take longer than expected
111+ (from the programmer's perspective). This is more likely when recovery has
112+ already taken longer than normal to complete (which, if you're reading this
113+ document, is likely the situation you find yourself in). Avoid unnecessary
114+ replacement loops by running the following command and extending the
115+ heartbeat grace period:
115116
116117 .. prompt:: bash #
117118
@@ -125,19 +126,21 @@ Do the following when restoring your file system:
125126* **Disable open-file-table prefetch.** Under normal circumstances, the MDS
126127 prefetches directory contents during recovery as a way of heating up its
127128 cache. During a long recovery, the cache is probably already hot **and
128- large**. So this behavior is unnecessary and can be undesirable. Disable
129- open-file-table prefetching by running the following command:
129+ large**. If the cache is already hot and large, this prefetching is
130+ unnecessary and can be undesirable. Disable open-file-table prefetching by
131+ running the following command:
130132
131133 .. prompt:: bash #
132134
133135 ceph config set mds mds_oft_prefetch_dirfrags false
134136
135137* **Turn off clients.** Clients that reconnect to the newly ``up:active`` MDS
136138 can create new load on the file system just as it is becoming operational.
137- Maintenance is often necessary before allowing clients to connect to the file
138- system and resuming a regular workload. For example, expediting the trimming
139- of journals may be advisable if the recovery took a long time because replay
140- was reading a very large journal.
139+ This is often undesirable. Maintenance is often necessary before allowing
140+ clients to connect to the file system and before resuming a regular workload.
141+ For example, expediting the trimming of journals may be advisable if the
142+ recovery took a long time due to the amount of time replay spent in reading a
143+ very large journal.
141144
142145 Client sessions can be refused manually, or by using the
143146 ``refuse_client_session`` tunable as in the following command:
@@ -149,9 +152,9 @@ Do the following when restoring your file system:
149152 This command has the effect of preventing clients from establishing new
150153 sessions with the MDS.
151154
152- * **Do not tweak max_mds.** Modifying the file system setting variable
153- ``max_mds`` is sometimes thought to be good step during troubleshooting or
154- recovery. But modifying ``max_mds`` might have the effect of further
155+ * **Do not tweak max_mds.** Modifying the file-system setting variable
156+ ``max_mds`` may seem like a good idea during troubleshooting and recovery,
157+ but it probably isn't. Modifying ``max_mds`` might have the effect of further
155158 destabilizing the cluster. If ``max_mds`` must be changed in such
156159 circumstances, run the command to change ``max_mds`` with the confirmation
157160 flag (``--yes-i-really-mean-it``).