
Commit eb5cebc

Merge pull request ceph#60919 from zdover23/wip-doc-2024-12-03-rados-ops-health-checks-2

doc/rados: fix sentences in health-checks (2 of x)

Reviewed-by: Anthony D'Atri <[email protected]>

2 parents d70fa73 + ee0ef76

1 file changed: +31 −34 lines

doc/rados/operations/health-checks.rst

Lines changed: 31 additions & 34 deletions
@@ -639,9 +639,10 @@ command:
 BLUESTORE_FRAGMENTATION
 _______________________
 
-As BlueStore operates, the free space on the underlying storage will become
-fragmented. This is normal and unavoidable, but excessive fragmentation causes
-slowdown. To inspect BlueStore fragmentation, run the following command:
+``BLUESTORE_FRAGMENTATION`` indicates that the free space that underlies
+BlueStore has become fragmented. This is normal and unavoidable, but excessive
+fragmentation causes slowdown. To inspect BlueStore fragmentation, run the
+following command:
 
 .. prompt:: bash $
 
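One way to inspect the fragmentation score that this paragraph introduces; a minimal sketch, assuming a running OSD named osd.123 with a reachable admin socket:

   ceph daemon osd.123 bluestore allocator score block

This reports a fragmentation score between 0 (no fragmentation) and 1 (severe fragmentation).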
@@ -680,11 +681,9 @@ One or more OSDs have BlueStore volumes that were created prior to the
 Nautilus release. (In Nautilus, BlueStore tracks its internal usage
 statistics on a granular, per-pool basis.)
 
-If *all* OSDs
-are older than Nautilus, this means that the per-pool metrics are
-simply unavailable. But if there is a mixture of pre-Nautilus and
-post-Nautilus OSDs, the cluster usage statistics reported by ``ceph
-df`` will be inaccurate.
+If *all* OSDs are older than Nautilus, this means that the per-pool metrics are
+simply unavailable. But if there is a mixture of pre-Nautilus and post-Nautilus
+OSDs, the cluster usage statistics reported by ``ceph df`` will be inaccurate.
 
 The old OSDs can be updated to use the new usage-tracking scheme by stopping
 each OSD, running a repair operation, and then restarting the OSD. For example,
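A sketch of the stop/repair/restart cycle described above, assuming a systemd-managed OSD with id 123 and the default BlueStore data path (adjust both for your deployment):

   systemctl stop ceph-osd@123
   ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
   systemctl start ceph-osd@123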
@@ -796,7 +795,7 @@ about the source of the problem.
 BLUESTORE_SPURIOUS_READ_ERRORS
 ______________________________
 
-One or more BlueStore OSDs detect read errors on the main device.
+One (or more) BlueStore OSDs detects read errors on the main device.
 BlueStore has recovered from these errors by retrying disk reads. This alert
 might indicate issues with underlying hardware, issues with the I/O subsystem,
 or something similar. Such issues can cause permanent data
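While the underlying hardware is investigated, the alert can be silenced temporarily; a sketch, assuming a one-week mute is acceptable:

   ceph health mute BLUESTORE_SPURIOUS_READ_ERRORS 1w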
@@ -824,7 +823,7 @@ _______________________________
 
 There are BlueStore log messages that reveal storage drive issues
 that can cause performance degradation and potentially data unavailability or
-loss.  These may indicate a storage drive that is failing and should be
+loss. These may indicate a storage drive that is failing and should be
 evaluated and possibly removed and replaced.
 
 ``read stalled read 0x29f40370000~100000 (buffered) since 63410177.290546s, timeout is 5.000000s``
@@ -851,7 +850,7 @@ To change this, run the following command:
    ceph config set global bdev_stalled_read_warn_lifetime 10
    ceph config set global bdev_stalled_read_warn_threshold 5
 
-this may be done for specific OSDs or a given mask. For example,
+This may be done for specific OSDs or a given mask. For example,
 to apply only to SSD OSDs:
 
 .. prompt:: bash $
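The mask-based form referenced here would look something like the following sketch, assuming the standard osd/class:ssd configuration mask:

   ceph config set osd/class:ssd bdev_stalled_read_warn_lifetime 10
   ceph config set osd/class:ssd bdev_stalled_read_warn_threshold 5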
@@ -864,30 +863,28 @@ to apply only to SSD OSDs:
 WAL_DEVICE_STALLED_READ_ALERT
 _____________________________
 
-The warning state ``WAL_DEVICE_STALLED_READ_ALERT`` is raised to
-indicate ``stalled read`` instances on a given BlueStore OSD's ``WAL_DEVICE``.
-This warning can be configured via the :confval:`bdev_stalled_read_warn_lifetime` and
-:confval:`bdev_stalled_read_warn_threshold` options with commands similar to those
-described in the
-``BLOCK_DEVICE_STALLED_READ_ALERT`` warning section.
+The warning state ``WAL_DEVICE_STALLED_READ_ALERT`` is raised to indicate
+``stalled read`` instances on a given BlueStore OSD's ``WAL_DEVICE``. This
+warning can be configured via the :confval:`bdev_stalled_read_warn_lifetime`
+and :confval:`bdev_stalled_read_warn_threshold` options with commands similar
+to those described in the ``BLOCK_DEVICE_STALLED_READ_ALERT`` warning section.
 
 DB_DEVICE_STALLED_READ_ALERT
 ____________________________
 
-The warning state ``DB_DEVICE_STALLED_READ_ALERT`` is raised to
-indicate ``stalled read`` instances on a given BlueStore OSD's ``DB_DEVICE``.
-This warning can be configured via the :confval:`bdev_stalled_read_warn_lifetime` and
-:confval:`bdev_stalled_read_warn_threshold` options with commands similar to those
-described in the
-``BLOCK_DEVICE_STALLED_READ_ALERT`` warning section.
+The warning state ``DB_DEVICE_STALLED_READ_ALERT`` is raised to indicate
+``stalled read`` instances on a given BlueStore OSD's ``DB_DEVICE``. This
+warning can be configured via the :confval:`bdev_stalled_read_warn_lifetime`
+and :confval:`bdev_stalled_read_warn_threshold` options with commands similar
+to those described in the ``BLOCK_DEVICE_STALLED_READ_ALERT`` warning section.
 
 BLUESTORE_SLOW_OP_ALERT
 _______________________
 
-There are BlueStore log messages that reveal storage drive issues
-that can lead to performance degradation and data unavailability or loss.
-These indicate that the storage drive may be failing and should be investigated
-and potentially replaced.
+There are BlueStore log messages that reveal storage drive issues that can lead
+to performance degradation and data unavailability or loss. These indicate
+that the storage drive may be failing and should be investigated and
+potentially replaced.
 
 ``log_latency_fn slow operation observed for _txc_committed_kv, latency = 12.028621219s, txc = 0x55a107c30f00``
 ``log_latency_fn slow operation observed for upper_bound, latency = 6.25955s``
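To confirm whether a particular OSD is emitting these messages, its log can be searched directly; a sketch, assuming a default log location for osd.123:

   grep 'slow operation observed' /var/log/ceph/ceph-osd.123.log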
@@ -1119,8 +1116,8 @@ LARGE_OMAP_OBJECTS
 __________________
 
 One or more pools contain large omap objects, as determined by
-``osd_deep_scrub_large_omap_object_key_threshold`` (threshold for the number of
-keys to determine what is considered a large omap object) or
+``osd_deep_scrub_large_omap_object_key_threshold`` (the threshold for the
+number of keys to determine what is considered a large omap object) or
 ``osd_deep_scrub_large_omap_object_value_sum_threshold`` (the threshold for the
 summed size in bytes of all key values to determine what is considered a large
 omap object) or both. To find more information on object name, key count, and
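Both thresholds are ordinary OSD options and can be tuned if the defaults are too strict for a workload; a sketch, assuming the stock defaults of 200000 keys and 1 GiB of summed value size:

   ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 200000
   ceph config set osd osd_deep_scrub_large_omap_object_value_sum_threshold 1073741824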
@@ -1140,7 +1137,7 @@ CACHE_POOL_NEAR_FULL
 ____________________
 
 A cache-tier pool is nearly full, as determined by the ``target_max_bytes`` and
-``target_max_objects`` properties of the cache pool. Once the pool reaches the
+``target_max_objects`` properties of the cache pool. When the pool reaches the
 target threshold, write requests to the pool might block while data is flushed
 and evicted from the cache. This state normally leads to very high latencies
 and poor performance.
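Both properties are set per pool; a sketch, assuming a hypothetical cache pool named hot-storage and a 1 TiB flush/evict target:

   ceph osd pool set hot-storage target_max_bytes 1099511627776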
@@ -1286,10 +1283,10 @@ For more information, see :ref:`choosing-number-of-placement-groups` and
 POOL_TARGET_SIZE_BYTES_OVERCOMMITTED
 ____________________________________
 
-One or more pools have a ``target_size_bytes`` property that is set in order to
-estimate the expected size of the pool, but the value(s) of this property are
-greater than the total available storage (either by themselves or in
-combination with other pools).
+One or more pools have a ``target_size_bytes`` property that is set in
+order to estimate the expected size of the pool, but the value or values of
+this property are greater than the total available storage (either by
+themselves or in combination with other pools).
 
 This alert is usually an indication that the ``target_size_bytes`` value for
 the pool is too large and should be reduced or set to zero. To reduce the
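Clearing the property is a single pool-set call; a sketch, assuming a hypothetical pool named mypool:

   ceph osd pool set mypool target_size_bytes 0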
