
Commit f44fc9f

Merge pull request ceph#64921 from zdover23/wip-doc-2025-08-09-cephfs-troubleshooting
doc/cephfs: edit troubleshooting.rst

Reviewed-by: Anthony D'Atri <[email protected]>
2 parents 056bea4 + edb3d2b commit f44fc9f

1 file changed: +21 -15 lines changed

doc/cephfs/troubleshooting.rst

Lines changed: 21 additions & 15 deletions
@@ -260,24 +260,30 @@ the developers!
 
 Slow requests (MDS)
 -------------------
-You can list current operations via the admin socket by running::
+List current operations via the admin socket by running the following command
+from the MDS host:
 
-    ceph daemon mds.<name> dump_ops_in_flight
+.. prompt:: bash #
+
+   ceph daemon mds.<name> dump_ops_in_flight
 
-from the MDS host. Identify the stuck commands and examine why they are stuck.
+Identify the stuck commands and examine why they are stuck.
 Usually the last "event" will have been an attempt to gather locks, or sending
-the operation off to the MDS log. If it is waiting on the OSDs, fix them. If
-operations are stuck on a specific inode, you probably have a client holding
-caps which prevent others from using it, either because the client is trying
-to flush out dirty data or because you have encountered a bug in CephFS'
-distributed file lock code (the file "capabilities" ["caps"] system).
-
-If it's a result of a bug in the capabilities code, restarting the MDS
-is likely to resolve the problem.
-
-If there are no slow requests reported on the MDS, and it is not reporting
-that clients are misbehaving, either the client has a problem or its
-requests are not reaching the MDS.
+the operation off to the MDS log. If it is waiting on the OSDs, fix them.
+
+If operations are stuck on a specific inode, then a client is likely holding
+capabilities, preventing its use by other clients. This situation can be caused
+by a client trying to flush dirty data, but it might be caused because you have
+encountered a bug in the distributed file lock code (the file "capabilities"
+["caps"] system) of CephFS.
+
+If you have determined that the commands are stuck because of a bug in the
+capabilities code, restart the MDS. Restarting the MDS is likely to resolve the
+problem.
+
+If there are no slow requests reported on the MDS, and there is no indication
+that clients are misbehaving, then either there is a problem with the client
+or the client's requests are not reaching the MDS.
 
 .. _ceph_fuse_debugging:
 
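The edited section tells the reader to dump the in-flight operations and then inspect the last "event" of each stuck op. A minimal sketch of scripting that inspection is shown below; it is not part of the commit, and it assumes the dump_ops_in_flight output is JSON containing an "ops" list whose entries carry "description", "age", and "type_data"/"events" fields. Those field names, and the placeholder daemon name "a", are assumptions that may vary by Ceph release.

    #!/usr/bin/env python3
    # Sketch: summarize MDS ops in flight and show the last recorded event per op.
    # Assumes the dump_ops_in_flight output is JSON with an "ops" list whose
    # entries carry "description", "age", and "type_data"/"events" fields; adjust
    # the field names if your Ceph release formats the dump differently.
    import json
    import subprocess

    MDS_NAME = "a"  # placeholder; substitute the local MDS daemon name


    def dump_ops_in_flight(mds_name: str) -> dict:
        """Run the admin-socket command on the MDS host and parse its JSON output."""
        out = subprocess.check_output(
            ["ceph", "daemon", f"mds.{mds_name}", "dump_ops_in_flight"]
        )
        return json.loads(out)


    def summarize(dump: dict) -> None:
        """Print each in-flight op's age, its last event, and its description."""
        ops = dump.get("ops", [])
        print(f"{len(ops)} op(s) in flight")
        for op in ops:
            events = op.get("type_data", {}).get("events", [])
            last_event = events[-1].get("event", "unknown") if events else "none"
            print(f"age={op.get('age', '?')}  last_event={last_event}")
            print(f"  {op.get('description', '')}")


    if __name__ == "__main__":
        summarize(dump_ops_in_flight(MDS_NAME))

Run it on the MDS host, where the admin socket lives. Ops waiting on locks, on the MDS log, or on the OSDs should surface that in their last event, which is the signal the edited text tells the reader to look for.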