Commit 9f581e1

Merge pull request ceph#60882 from anthonyeleven/59466-followup

os/bluestore: Improve documentation introduced by ceph#57722
Reviewed-by: Zac Dover <[email protected]>

2 parents 5a2a3a6 + b6eb98c commit 9f581e1

File tree

2 files changed: +54 −39 lines changed

doc/rados/operations/health-checks.rst

Lines changed: 44 additions & 35 deletions
@@ -824,25 +824,27 @@ Or, to disable this alert on a specific OSD, run the following command:

BLOCK_DEVICE_STALLED_READ_ALERT
_______________________________

There are BlueStore log messages that reveal storage drive issues
that can cause performance degradation and potentially data unavailability or
loss. These may indicate a storage drive that is failing and should be
evaluated and possibly removed and replaced.

``read stalled read 0x29f40370000~100000 (buffered) since 63410177.290546s, timeout is 5.000000s``

However, this is difficult to spot because there is no discernible warning (a
health warning or info in ``ceph health detail``, for example). More observations
can be found here: https://tracker.ceph.com/issues/62500

Also, because there can be false positive ``stalled read`` instances, a mechanism
has been added to increase accuracy. If in the last ``bdev_stalled_read_warn_lifetime``
seconds the number of ``stalled read`` events is found to be greater than or equal to
``bdev_stalled_read_warn_threshold`` for a given BlueStore block device, this
warning will be reported in ``ceph health detail``. The warning state will be
removed when the condition clears.

The defaults for :confval:`bdev_stalled_read_warn_lifetime`
and :confval:`bdev_stalled_read_warn_threshold` may be overridden globally or for
specific OSDs.
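The lifetime/threshold mechanism described above can be sketched as a sliding-window counter. This is an illustrative model only, not BlueStore's actual implementation; the class and method names are invented:

```python
import collections

class StalledReadAlert:
    """Toy sliding-window model of the bdev_stalled_read_warn_* behavior."""

    def __init__(self, warn_lifetime=86400, warn_threshold=1):
        self.warn_lifetime = warn_lifetime    # seconds an event stays relevant
        self.warn_threshold = warn_threshold  # events needed to raise the warning
        self.events = collections.deque()     # timestamps of observed stalled reads

    def record_stalled_read(self, now):
        self.events.append(now)

    def warning_raised(self, now):
        # Discard events older than the lifetime window, then compare the count.
        while self.events and now - self.events[0] > self.warn_lifetime:
            self.events.popleft()
        return len(self.events) >= self.warn_threshold

alert = StalledReadAlert(warn_lifetime=10, warn_threshold=2)
alert.record_stalled_read(now=0)
alert.record_stalled_read(now=1)
print(alert.warning_raised(now=2))   # two events inside the window -> True
print(alert.warning_raised(now=20))  # events aged out -> False (warning clears)
```

Note how the warning clears on its own once the old events fall outside the lifetime window, matching the "removed when the condition clears" behavior described above.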

To change this, run the following command:

@@ -851,7 +853,8 @@ To change this, run the following command:
.. prompt:: bash $

   ceph config set global bdev_stalled_read_warn_lifetime 10
   ceph config set global bdev_stalled_read_warn_threshold 5

this may be done for specific OSDs or a given mask. For example,
to apply only to SSD OSDs:

.. prompt:: bash $
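The mask-based invocation itself lies outside this diff hunk. As an illustration only, using Ceph's CRUSH device-class mask syntax for ``ceph config set`` (verify against your release's documentation):

```shell
# Illustrative: apply the stalled-read settings only to OSDs whose
# CRUSH device class is "ssd", via a config mask.
ceph config set osd/class:ssd bdev_stalled_read_warn_lifetime 10
ceph config set osd/class:ssd bdev_stalled_read_warn_threshold 5
```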

@@ -863,40 +866,45 @@ this may be done surgically for individual OSDs or a given mask
WAL_DEVICE_STALLED_READ_ALERT
_____________________________

The warning state ``WAL_DEVICE_STALLED_READ_ALERT`` is raised to
indicate ``stalled read`` instances on a given BlueStore OSD's ``WAL_DEVICE``.
This warning can be configured via the :confval:`bdev_stalled_read_warn_lifetime` and
:confval:`bdev_stalled_read_warn_threshold` options with commands similar to those
described in the
``BLOCK_DEVICE_STALLED_READ_ALERT`` warning section.

DB_DEVICE_STALLED_READ_ALERT
____________________________

The warning state ``DB_DEVICE_STALLED_READ_ALERT`` is raised to
indicate ``stalled read`` instances on a given BlueStore OSD's ``DB_DEVICE``.
This warning can be configured via the :confval:`bdev_stalled_read_warn_lifetime` and
:confval:`bdev_stalled_read_warn_threshold` options with commands similar to those
described in the
``BLOCK_DEVICE_STALLED_READ_ALERT`` warning section.

BLUESTORE_SLOW_OP_ALERT
_______________________

There are BlueStore log messages that reveal storage drive issues
that can lead to performance degradation and data unavailability or loss.
These indicate that the storage drive may be failing and should be investigated
and potentially replaced.

``log_latency_fn slow operation observed for _txc_committed_kv, latency = 12.028621219s, txc = 0x55a107c30f00``
``log_latency_fn slow operation observed for upper_bound, latency = 6.25955s``
``log_latency slow operation observed for submit_transaction..``

As there can be false positive ``slow ops`` instances, a mechanism has
been added for more reliability. If in the last ``bluestore_slow_ops_warn_lifetime``
seconds the number of ``slow ops`` indications is found to be greater than or equal to
:confval:`bluestore_slow_ops_warn_threshold` for a given BlueStore OSD, this
warning will be reported in ``ceph health detail``. The warning state is
cleared when the condition clears.

The defaults for :confval:`bluestore_slow_ops_warn_lifetime` and
:confval:`bluestore_slow_ops_warn_threshold` may be overridden globally or for
specific OSDs.

To change this, run the following command:

@@ -905,7 +913,7 @@ To change this, run the following command:
.. prompt:: bash $

   ceph config set global bluestore_slow_ops_warn_lifetime 10
   ceph config set global bluestore_slow_ops_warn_threshold 5

this may be done for specific OSDs or a given mask, for example:

.. prompt:: bash $
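The exact command is outside this diff hunk. As an illustration only, a per-OSD override uses the daemon name as the config section; ``osd.123`` here is a placeholder ID:

```shell
# Illustrative: override the slow-ops warning settings for a single OSD.
ceph config set osd.123 bluestore_slow_ops_warn_lifetime 10
ceph config set osd.123 bluestore_slow_ops_warn_threshold 5
```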

@@ -931,17 +939,18 @@ the system. Note that this marking ``out`` is normally done automatically if
``mgr/devicehealth/mark_out_threshold``). If an OSD device is compromised but
the OSD(s) on that device are still ``up``, recovery can be degraded. In such
cases it may be advantageous to forcibly stop the OSD daemon(s) in question so
that recovery can proceed from surviving healthy OSDs. This must be
done with extreme care and attention to failure domains so that data availability
is not compromised.

To check device health, run the following command:

.. prompt:: bash $

   ceph device info <device-id>

Device life expectancy is set either by a prediction model that the Ceph Manager
runs or by an external tool that runs a command of the following form:

.. prompt:: bash $

@@ -1095,7 +1104,7 @@ ____________________
The count of read repairs has exceeded the config value threshold
``mon_osd_warn_num_repaired`` (default: ``10``). Because scrub handles errors
only for data at rest, and because any read error that occurs when another
replica is available is repaired immediately so that the client can get
the object data, there might exist failing disks that are not registering any
scrub errors. This repair count is maintained as a way of identifying any such
failing disks.
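When the repair count is understood and acceptable, the threshold can be raised with ``ceph config set``. The value below is illustrative only, not a recommendation:

```shell
# Illustrative: raise the read-repair warning threshold from its default of 10.
ceph config set global mon_osd_warn_num_repaired 30
```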
@@ -1354,7 +1363,7 @@ data have too many PGs. See *TOO_MANY_PGS* above.
To silence the health check, raise the threshold by adjusting the
``mon_pg_warn_max_object_skew`` config option on the managers.

The health check is silenced for a specific pool only if
``pg_autoscale_mode`` is set to ``on``.
POOL_APP_NOT_ENABLED
@@ -1489,7 +1498,7 @@ percentage (determined by ``mon_warn_pg_not_scrubbed_ratio``) of the interval
has elapsed after the time the scrub was scheduled and no scrub has been
performed.

PGs are scrubbed only if they are flagged as ``clean`` (which means that
they are to be cleaned, and not that they have been examined and found to be
clean). Misplaced or degraded PGs will not be flagged as ``clean`` (see
*PG_AVAILABILITY* and *PG_DEGRADED* above).

src/common/options/global.yaml.in

Lines changed: 10 additions & 4 deletions
@@ -5485,15 +5485,21 @@ options:
- name: bluestore_slow_ops_warn_lifetime
  type: uint
  level: advanced
  desc: Set the time period during which a BlueStore slow ops warning will be raised when the `bluestore_slow_ops_warn_threshold` is exceeded. This is not the same as `osd_op_complaint_time`, which is about RADOS ops at the OSD level.
  default: 86400
  with_legacy: true
  see_also:
  - bluestore_slow_ops_warn_threshold
  - osd_op_complaint_time
- name: bluestore_slow_ops_warn_threshold
  type: uint
  level: advanced
  desc: Set the minimum number of BlueStore slow ops before raising a health warning state
  default: 1
  with_legacy: true
  see_also:
  - bluestore_slow_ops_warn_lifetime
  - osd_op_complaint_time
- name: bluestore_fsck_error_on_no_per_pool_omap
  type: bool
  level: advanced
@@ -5566,7 +5572,7 @@ options:
  level: dev
  desc: Sets threshold at which shrinking max free chunk size triggers enabling best-fit
    mode.
  long_desc: 'The AVL allocator works in two modes: near-fit and best-fit. By default,
    it uses very fast near-fit mode, in which it tries to fit a new block near the
    last allocated block of similar size. The second mode is much slower best-fit
    mode, in which it tries to find an exact match for the requested allocation. This
@@ -5586,7 +5592,7 @@ options:
    last allocated block of similar size. The second mode is much slower best-fit
    mode, in which it tries to find an exact match for the requested allocation. This
    mode is used when either the device gets fragmented or when it is low on free
    space. When free space is smaller than `bluestore_avl_alloc_bf_free_pct`, best-fit
    mode is used.'
  default: 4
  see_also:
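The free-space rule that `bluestore_avl_alloc_bf_free_pct` controls can be sketched as follows. This is a toy model for illustration only (it ignores the fragmentation trigger mentioned in the long_desc, and the function name is invented):

```python
def avl_alloc_mode(free_bytes: int, capacity_bytes: int, bf_free_pct: int = 4) -> str:
    """Toy model: the AVL allocator uses fast near-fit search by default and
    falls back to exhaustive best-fit search once the free-space percentage
    drops below bf_free_pct (default 4, mirroring the option's default)."""
    free_pct = 100.0 * free_bytes / capacity_bytes
    return "best-fit" if free_pct < bf_free_pct else "near-fit"

print(avl_alloc_mode(50 << 30, 100 << 30))  # 50% free -> near-fit
print(avl_alloc_mode(2 << 30, 100 << 30))   # 2% free  -> best-fit
```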
