@@ -91,13 +91,14 @@ Do the following when restoring your file system:
   ``refuse_client_session`` file-system setting to prevent new sessions from
   connecting to the CephFS.
 
-* **Extend the MDS heartbeat grace period.** This avoids replacing an MDS that
-  appears "stuck" during some operation. Sometimes recovery of an MDS may
-  involve an operation that takes longer than expected (from the programmer's
-  perspective). This is more likely when recovery is already taking longer than
-  normal to complete (indicated by your reading this document). Avoid
-  unnecessary replacement loops by running the following command and extending
-  the heartbeat grace period:
+* **Extend the MDS heartbeat grace period.** Doing this causes the system to
+  avoid replacing an MDS that becomes "stuck" during an operation. Sometimes
+  recovery of an MDS may involve operations that take longer than expected
+  (from the programmer's perspective). This is more likely when recovery has
+  already taken longer than normal to complete (which, if you're reading this
+  document, is likely the situation you find yourself in). Avoid unnecessary
+  replacement loops by running the following command and extending the
+  heartbeat grace period:
 
   .. prompt:: bash #
 
@@ -111,19 +112,21 @@ Do the following when restoring your file system:
 * **Disable open-file-table prefetch.** Under normal circumstances, the MDS
   prefetches directory contents during recovery as a way of heating up its
   cache. During a long recovery, the cache is probably already hot **and
-  large**. So this behavior is unnecessary and can be undesirable. Disable
-  open-file-table prefetching by running the following command:
+  large**. If the cache is already hot and large, this prefetching is
+  unnecessary and can be undesirable. Disable open-file-table prefetching by
+  running the following command:
 
   .. prompt:: bash #
 
      ceph config set mds mds_oft_prefetch_dirfrags false
 
 * **Turn off clients.** Clients that reconnect to the newly ``up:active`` MDS
   can create new load on the file system just as it is becoming operational.
-  Maintenance is often necessary before allowing clients to connect to the file
-  system and resuming a regular workload. For example, expediting the trimming
-  of journals may be advisable if the recovery took a long time because replay
-  was reading a very large journal.
+  This is often undesirable. Maintenance is often necessary before allowing
+  clients to connect to the file system and before resuming a regular workload.
+  For example, expediting the trimming of journals may be advisable if the
+  recovery took a long time due to the amount of time replay spent in reading a
+  very large journal.
 
   Client sessions can be refused manually, or by using the
   ``refuse_client_session`` tunable as in the following command:
@@ -135,9 +138,9 @@ Do the following when restoring your file system:
   This command has the effect of preventing clients from establishing new
   sessions with the MDS.
 
-* **Do not tweak max_mds.** Modifying the file system setting variable
-  ``max_mds`` is sometimes thought to be good step during troubleshooting or
-  recovery. But modifying ``max_mds`` might have the effect of further
+* **Do not tweak max_mds.** Modifying the file-system setting variable
+  ``max_mds`` may seem like a good idea during troubleshooting and recovery,
+  but it probably isn't. Modifying ``max_mds`` might have the effect of further
   destabilizing the cluster. If ``max_mds`` must be changed in such
   circumstances, run the command to change ``max_mds`` with the confirmation
   flag (``--yes-i-really-mean-it``).
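
A possible companion to the open-file-table prefetch step in the second hunk: a minimal dry-run sketch that prints the documented command, a read-back with `ceph config get`, and a revert with `ceph config rm` once recovery is finished. The `run` helper and the `APPLY` variable are hypothetical names introduced here, and the read-back and revert steps are suggestions rather than part of the patched documentation.

```shell
#!/usr/bin/env bash
# Dry-run helper (hypothetical): prints each command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Command taken from the patched documentation.
run ceph config set mds mds_oft_prefetch_dirfrags false

# Assumed follow-up: read the value back to confirm it was stored.
run ceph config get mds mds_oft_prefetch_dirfrags

# Assumed cleanup after recovery: remove the override to restore the default.
run ceph config rm mds mds_oft_prefetch_dirfrags
```

Running the script as-is only prints the three commands. Setting `APPLY=1` executes them against the cluster, so it is worth reviewing the printed output first.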