Commit 58da0c9

Merge pull request ceph#64928 from zdover23/wip-doc-2025-08-10-cephfs-troubleshooting
doc/cephfs: edit troubleshooting.rst

Reviewed-by: Anthony D'Atri <[email protected]>
2 parents f44fc9f + a17fd3f commit 58da0c9

1 file changed

+17
-10
lines changed

doc/cephfs/troubleshooting.rst

Lines changed: 17 additions & 10 deletions
@@ -238,20 +238,27 @@ See the :ref:`RADOS troubleshooting documentation<rados_troubleshooting>`.
 The MDS
 =======
 
-If an operation is hung inside the MDS, it will eventually show up in ``ceph health``,
-identifying "slow requests are blocked". It may also identify clients as
-"failing to respond" or misbehaving in other ways. If the MDS identifies
-specific clients as misbehaving, you should investigate why they are doing so.
+Run the ``ceph health`` command. Any operation that is hung in the MDS is
+indicated by the ``slow requests are blocked`` message.
 
-Generally it will be the result of
+Messages that read ``failing to respond`` indicate that a client is failing to
+respond.
 
-#. Overloading the system (if you have extra RAM, increase the
-   "mds cache memory limit" config from its default 1GiB; having a larger active
-   file set than your MDS cache is the #1 cause of this!).
+The following list details potential causes of hung operations:
 
-#. Running an older (misbehaving) client.
+#. The system is overloaded. The most likely cause of system overload is an
+   active file set that is larger than the MDS cache.
+
+   If you have extra RAM, increase the ``mds_cache_memory_limit``. The specific
+   tunable ``mds_cache_memory_limit`` is discussed in the :ref:`MDS Cache
+   Size<cephfs_cache_configuration_mds_cache_memory_limit>`. Read the :ref:`MDS
+   Cache Configuration<cephfs_mds_cache_configuration>` section in full before
+   making any alterations to the ``mds_cache_memory_limit`` tunable.
 
-#. Underlying RADOS issues.
+#. There is an older (misbehaving) client.
+
+#. There are underlying RADOS issues. See :ref:`The RADOS troubleshooting
+   documentation<rados_troubleshooting>`.
 
 Otherwise, you have probably discovered a new bug and should report it to
 the developers!
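
The triage that the revised text describes (look in ``ceph health`` output for the two named messages) could be sketched roughly as follows. This is only an illustration: the function name ``classify_mds_health`` and the sample text are hypothetical, the sample is not copied from a real cluster, and actual health output differs between Ceph releases. Only the two message strings come from the documentation being edited.

```python
# The two message substrings named in the troubleshooting text.
SLOW_REQUESTS = "slow requests are blocked"
FAILING_CLIENT = "failing to respond"


def classify_mds_health(health_text: str) -> dict:
    """Group health-output lines by which documented symptom they mention.

    Hypothetical helper: it does a plain substring match and makes no
    assumption about the rest of the line's format.
    """
    lines = [ln.strip() for ln in health_text.splitlines() if ln.strip()]
    return {
        "hung_operations": [ln for ln in lines if SLOW_REQUESTS in ln],
        "unresponsive_clients": [ln for ln in lines if FAILING_CLIENT in ln],
    }


# Made-up sample text for illustration only (not real cluster output).
sample = """
example line one: slow requests are blocked
example line two: client is failing to respond
example line three: nothing of interest
"""

result = classify_mds_health(sample)
for symptom, matches in result.items():
    print(symptom, len(matches))
```

In practice the input would be the text captured from running ``ceph health``. A non-empty ``hung_operations`` list points at the causes listed in the diff (cache overload, an older client, or RADOS issues), while ``unresponsive_clients`` points at a specific client to investigate.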
