@@ -238,20 +238,27 @@ See the :ref:`RADOS troubleshooting documentation<rados_troubleshooting>`.
 The MDS
 =======
 
-If an operation is hung inside the MDS, it will eventually show up in ``ceph health``,
-identifying "slow requests are blocked". It may also identify clients as
-"failing to respond" or misbehaving in other ways. If the MDS identifies
-specific clients as misbehaving, you should investigate why they are doing so.
+Run the ``ceph health`` command. Any operation that is hung in the MDS is
+indicated by the ``slow requests are blocked`` message.
 
-Generally it will be the result of
+Messages that read ``failing to respond`` indicate that a client is failing to
+respond.
 
-#. Overloading the system (if you have extra RAM, increase the
-   "mds cache memory limit" config from its default 1GiB; having a larger active
-   file set than your MDS cache is the #1 cause of this!).
+The following list details potential causes of hung operations:
 
-#. Running an older (misbehaving) client.
+#. The system is overloaded. The most likely cause of system overload is an
+   active file set that is larger than the MDS cache.
+
+   If you have extra RAM, increase the ``mds_cache_memory_limit``. The specific
+   tunable ``mds_cache_memory_limit`` is discussed in the :ref:`MDS Cache
+   Size<cephfs_cache_configuration_mds_cache_memory_limit>`. Read the :ref:`MDS
+   Cache Configuration<cephfs_mds_cache_configuration>` section in full before
+   making any alterations to the ``mds_cache_memory_limit`` tunable.
 
-#. Underlying RADOS issues.
+#. There is an older (misbehaving) client.
+
+#. There are underlying RADOS issues. See :ref:`The RADOS troubleshooting
+   documentation<rados_troubleshooting>`.
 
 Otherwise, you have probably discovered a new bug and should report it to
 the developers!
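
The workflow described in the new text (check health, then consider raising the MDS cache limit) could look roughly like the sketch below. It is an illustration only, not part of the commit: the 4 GiB target and the ``gib_to_bytes`` helper are hypothetical, and it assumes the ``ceph`` CLI is on the path with admin credentials. The script only prints the ``ceph config set`` command instead of running it, since the documentation asks readers to review the MDS cache configuration section before changing the tunable.

```shell
#!/bin/sh
# Hypothetical helper: convert whole GiB to bytes, the unit in which
# mds_cache_memory_limit is expressed here.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Assumed target of 4 GiB; size this to the RAM actually free on the MDS host.
new_limit=$(gib_to_bytes 4)

if command -v ceph >/dev/null 2>&1; then
    # Look for "slow requests are blocked" and "failing to respond" messages.
    ceph health detail
    # Show the current cache limit before deciding on a change.
    ceph config get mds mds_cache_memory_limit
fi

# Print the change rather than applying it, so it can be reviewed first.
echo "ceph config set mds mds_cache_memory_limit $new_limit"
```

Running the printed command applies the new limit to all MDS daemons through the central configuration database.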