Commit ebf66bf

Merge pull request ceph#64854 from zdover23/wip-doc-2025-08-06-cephfs-troubleshooting-stuck-during-recovery

doc/cephfs: edit troubleshooting.rst

Reviewed-by: Anthony D'Atri <[email protected]>

2 parents e03fe65 + d676399

1 file changed: +58 -46 lines

doc/cephfs/troubleshooting.rst
@@ -76,90 +76,102 @@ the progression of the read position to compute the expected time to complete.
Avoiding recovery roadblocks
----------------------------

Do the following when restoring your file system:

* **Deny all reconnection to clients.** Blocklist all existing CephFS sessions,
  causing all mounts to hang or become unavailable:

  .. prompt:: bash #

     ceph config set mds mds_deny_all_reconnect true

  Remember to undo this after the MDS becomes active.

  .. note:: This does not prevent new sessions from connecting. Use the
     ``refuse_client_session`` file-system setting to prevent new sessions from
     connecting to the CephFS.
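
  One way to undo this after the MDS becomes active is to remove the override,
  which reverts ``mds_deny_all_reconnect`` to its default of ``false``:

  .. prompt:: bash #

     ceph config rm mds mds_deny_all_reconnect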

* **Extend the MDS heartbeat grace period.** This avoids replacing an MDS that
  appears "stuck" during some operation. Sometimes recovery of an MDS may
  involve an operation that takes longer than expected (from the programmer's
  perspective). This is more likely when recovery is already taking longer than
  normal to complete (indicated by your reading this document). Avoid
  unnecessary replacement loops by running the following command and extending
  the heartbeat grace period:

  .. prompt:: bash #

     ceph config set mds mds_heartbeat_grace 3600

  .. note:: This causes the MDS to continue to send beacons to the monitors
     even when its internal "heartbeat" mechanism has not been reset (it has
     not beaten) in one hour. In the past, this was achieved with the
     ``mds_beacon_grace`` monitor setting.
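
  While the grace period is extended, recovery progress can be watched from
  another terminal. For example, ``ceph fs status`` reports the state of each
  MDS daemon (``up:replay``, ``up:reconnect``, ``up:rejoin``, and finally
  ``up:active``):

  .. prompt:: bash #

     ceph fs status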

* **Disable open-file-table prefetch.** Under normal circumstances, the MDS
  prefetches directory contents during recovery as a way of heating up its
  cache. During a long recovery, the cache is probably already hot **and
  large**. So this behavior is unnecessary and can be undesirable. Disable
  open-file-table prefetching by running the following command:

  .. prompt:: bash #

     ceph config set mds mds_oft_prefetch_dirfrags false
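
  The setting's current value can be confirmed with ``ceph config get``:

  .. prompt:: bash #

     ceph config get mds mds_oft_prefetch_dirfrags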

* **Turn off clients.** Clients that reconnect to the newly ``up:active`` MDS
  can create new load on the file system just as it is becoming operational.
  Maintenance is often necessary before allowing clients to connect to the file
  system and resuming a regular workload. For example, expediting the trimming
  of journals may be advisable if the recovery took a long time because replay
  was reading a very large journal.

  Client sessions can be refused manually, or by using the
  ``refuse_client_session`` tunable as in the following command:

  .. prompt:: bash #

     ceph fs set <fs_name> refuse_client_session true

  This command has the effect of preventing clients from establishing new
  sessions with the MDS.
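
  After maintenance is complete, new client sessions can be allowed again by
  setting the tunable back to ``false``:

  .. prompt:: bash #

     ceph fs set <fs_name> refuse_client_session false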

* **Do not tweak max_mds.** Modifying the file system setting variable
  ``max_mds`` is sometimes thought to be a good step during troubleshooting or
  recovery. But modifying ``max_mds`` might have the effect of further
  destabilizing the cluster. If ``max_mds`` must be changed in such
  circumstances, run the command to change ``max_mds`` with the confirmation
  flag (``--yes-i-really-mean-it``).
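
  If the change truly is necessary, the command takes the following shape,
  where ``1`` is only an example rank count:

  .. prompt:: bash #

     ceph fs set <fs_name> max_mds 1 --yes-i-really-mean-it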

.. _pause-purge-threads:

* **Turn off async purge threads.** The volumes plugin spawns threads that
  asynchronously purge trashed or deleted subvolumes. During troubleshooting or
  recovery, these purge threads can be disabled by running the following
  command:

  .. prompt:: bash #

     ceph config set mgr mgr/volumes/pause_purging true

  To resume purging, run the following command:

  .. prompt:: bash #

     ceph config set mgr mgr/volumes/pause_purging false

.. _pause-clone-threads:

* **Turn off async cloner threads.** The volumes plugin spawns threads that
  asynchronously clone subvolume snapshots. During troubleshooting or recovery,
  these cloner threads can be disabled by running the following command:

  .. prompt:: bash #

     ceph config set mgr mgr/volumes/pause_cloning true

  To resume cloning, run the following command:

  .. prompt:: bash #

     ceph config set mgr mgr/volumes/pause_cloning false
