Commit 790afd2

nh2 authored and tchaikov committed
doc: Document which options are disabled by mClock.
Not only in the mClock docs, but also in the reference of the options that are disabled. Otherwise users are bound to miss it, and surprised why their options are ignored or reset.

Signed-off-by: Niklas Hambüchen <[email protected]>
1 parent 2505e27 commit 790afd2

3 files changed: 38 additions, 4 deletions


doc/rados/configuration/osd-config-ref.rst

Lines changed: 7 additions & 0 deletions
@@ -373,6 +373,8 @@ considerably. To maintain operational performance, Ceph performs this migration
 with 'backfilling', which allows Ceph to set backfill operations to a lower
 priority than requests to read or write data.
 
+.. note:: Some of these settings are automatically reset if the `mClock`_
+scheduler is active, see `mClock backfill`_.
 
 .. confval:: osd_max_backfills
 .. confval:: osd_backfill_scan_min
@@ -415,6 +417,9 @@ To maintain operational performance, Ceph performs recovery with limitations on
 the number recovery requests, threads and object chunk sizes which allows Ceph
 perform well in a degraded state.
 
+.. note:: Some of these settings are automatically reset if the `mClock`_
+scheduler is active, see `mClock backfill`_.
+
 .. confval:: osd_recovery_delay_start
 .. confval:: osd_recovery_max_active
 .. confval:: osd_recovery_max_active_hdd
@@ -452,6 +457,8 @@ Miscellaneous
 .. _pool: ../../operations/pools
 .. _Configuring Monitor/OSD Interaction: ../mon-osd-interaction
 .. _Monitoring OSDs and PGs: ../../operations/monitoring-osd-pg#peering
+.. _mClock: ../mclock-config-ref.rst
+.. _mClock backfill: ../mclock-config-ref.rst#recovery-backfill-options
 .. _Pool & PG Config Reference: ../pool-pg-config-ref
 .. _Journal Config Reference: ../journal-ref
 .. _cache target dirty high ratio: ../../operations/pools#cache-target-dirty-high-ratio

doc/rados/operations/monitoring-osd-pg.rst

Lines changed: 6 additions & 1 deletion
@@ -419,7 +419,10 @@ conditions change.
 Ceph provides a number of settings to manage the load spike associated with the
 reassignment of PGs to an OSD (especially a new OSD). The ``osd_max_backfills``
 setting specifies the maximum number of concurrent backfills to and from an OSD
-(default: 1). The ``backfill_full_ratio`` setting allows an OSD to refuse a
+(default: 1; note you cannot change this if the `mClock`_ scheduler is active,
+unless you set ``osd_mclock_override_recovery_settings = true``, see
+`mClock backfill`_).
+The ``backfill_full_ratio`` setting allows an OSD to refuse a
 backfill request if the OSD is approaching its full ratio (default: 90%). This
 setting can be changed with the ``ceph osd set-backfillfull-ratio`` command. If
 an OSD refuses a backfill request, the ``osd_backfill_retry_interval`` setting
@@ -545,6 +548,8 @@ performing the migration. For details, see the `Architecture`_ section.
 .. _data placement: ../data-placement
 .. _pool: ../pools
 .. _placement group: ../placement-groups
+.. _mClock: ../../configuration/mclock-config-ref.rst
+.. _mClock backfill: ../../configuration/mclock-config-ref.rst#recovery-backfill-options
 .. _Architecture: ../../../architecture
 .. _OSD Not Running: ../../troubleshooting/troubleshooting-osd#osd-not-running
 .. _Troubleshooting PG Errors: ../../troubleshooting/troubleshooting-pg#troubleshooting-pg-errors

src/common/options/osd.yaml.in

Lines changed: 25 additions & 3 deletions
@@ -58,7 +58,10 @@ options:
 in recovery and 1 shard of another recovering PG.
 fmt_desc: The maximum number of backfills allowed to or from a single OSD.
 Note that this is applied separately for read and write operations.
+This setting is automatically reset when the mClock scheduler is used.
 default: 1
+see_also:
+- osd_mclock_override_recovery_settings
 flags:
 - runtime
 with_legacy: true
@@ -95,6 +98,7 @@ options:
 fmt_desc: Time in seconds to sleep before the next recovery or backfill op.
 Increasing this value will slow down recovery operation while
 client operations will be less impacted.
+note: This setting is ignored when the mClock scheduler is used.
 default: 0
 flags:
 - runtime
@@ -105,6 +109,7 @@ options:
 desc: Time in seconds to sleep before next recovery or backfill op for HDDs
 fmt_desc: Time in seconds to sleep before next recovery or backfill op
 for HDDs.
+note: This setting is ignored when the mClock scheduler is used.
 default: 0.1
 flags:
 - runtime
@@ -115,6 +120,7 @@ options:
 desc: Time in seconds to sleep before next recovery or backfill op for SSDs
 fmt_desc: Time in seconds to sleep before the next recovery or backfill op
 for SSDs.
+note: This setting is ignored when the mClock scheduler is used.
 default: 0
 see_also:
 - osd_recovery_sleep
@@ -128,6 +134,7 @@ options:
 on HDD and journal is on SSD
 fmt_desc: Time in seconds to sleep before the next recovery or backfill op
 when OSD data is on HDD and OSD journal / WAL+DB is on SSD.
+note: This setting is ignored when the mClock scheduler is used.
 default: 0.025
 see_also:
 - osd_recovery_sleep
@@ -141,6 +148,7 @@ options:
 fmt_desc: Time in seconds to sleep before next snap trim op.
 Increasing this value will slow down snap trimming.
 This option overrides backend specific variants.
+note: This setting is ignored when the mClock scheduler is used.
 default: 0
 flags:
 - runtime
@@ -149,6 +157,7 @@ options:
 type: float
 level: advanced
 desc: Time in seconds to sleep before next snap trim for HDDs
+note: This setting is ignored when the mClock scheduler is used.
 default: 5
 flags:
 - runtime
@@ -158,6 +167,7 @@ options:
 desc: Time in seconds to sleep before next snap trim for SSDs
 fmt_desc: Time in seconds to sleep before next snap trim op
 for SSD OSDs (including NVMe).
+note: This setting is ignored when the mClock scheduler is used.
 default: 0
 flags:
 - runtime
@@ -168,6 +178,7 @@ options:
 is on SSD
 fmt_desc: Time in seconds to sleep before next snap trim op
 when OSD data is on an HDD and the OSD journal or WAL+DB is on an SSD.
+note: This setting is ignored when the mClock scheduler is used.
 default: 2
 flags:
 - runtime
@@ -182,6 +193,7 @@ options:
 desc: Maximum concurrent scrubs on a single OSD
 fmt_desc: The maximum number of simultaneous scrub operations for
 a Ceph OSD Daemon.
+note: This setting is ignored when the mClock scheduler is used.
 default: 3
 with_legacy: true
 - name: osd_scrub_during_recovery
@@ -377,7 +389,7 @@ options:
 fmt_desc: Sleep time in seconds before scrubbing the next group of objects (the next chunk).
 Increasing this value will slow down the overall rate of scrubbing, reducing scrub
 impact on client operations.
-This setting is ignored when the mClock scheduler is used.
+note: This setting is ignored when the mClock scheduler is used.
 default: 0
 flags:
 - runtime
@@ -392,7 +404,7 @@ options:
 This configuration value is used for scrubbing out of scrubbing hours.
 Increasing this value will slow down the overall rate of scrubbing, reducing scrub
 impact on client operations.
-This setting is ignored when the mClock scheduler is used.
+note: This setting is ignored when the mClock scheduler is used.
 default: 0
 see_also:
 - osd_scrub_begin_hour
@@ -1336,10 +1348,12 @@ options:
 is ``0``, which means that the ``hdd`` or ``ssd`` values
 (below) are used, depending on the type of the primary
 device backing the OSD.
+This setting is automatically reset when the mClock scheduler is used.
 default: 0
 see_also:
 - osd_recovery_max_active_hdd
 - osd_recovery_max_active_ssd
+- osd_mclock_override_recovery_settings
 flags:
 - runtime
 with_legacy: true
@@ -1350,10 +1364,12 @@ options:
 devices)
 fmt_desc: The number of active recovery requests per OSD at one time, if the
 primary device is rotational.
+note: This setting is automatically reset when the mClock scheduler is used.
 default: 3
 see_also:
 - osd_recovery_max_active
 - osd_recovery_max_active_ssd
+- osd_mclock_override_recovery_settings
 flags:
 - runtime
 with_legacy: true
@@ -1364,10 +1380,12 @@ options:
 solid state devices)
 fmt_desc: The number of active recovery requests per OSD at one time, if the
 primary device is non-rotational (i.e., an SSD).
+note: This setting is automatically reset when the mClock scheduler is used.
 default: 10
 see_also:
 - osd_recovery_max_active
 - osd_recovery_max_active_hdd
+- osd_mclock_override_recovery_settings
 flags:
 - runtime
 with_legacy: true
@@ -1462,20 +1480,23 @@ options:
 overrides _ssd, _hdd, and _hybrid if non-zero.
 fmt_desc: Time in seconds to sleep before the next removal transaction. This
 throttles the PG deletion process.
+note: This setting is ignored when the mClock scheduler is used.
 default: 0
 flags:
 - runtime
 - name: osd_delete_sleep_hdd
 type: float
 level: advanced
-desc: Time in seconds to sleep before next removal transaction for HDDs
+desc: Time in seconds to sleep before next removal transaction for HDDs.
+note: This setting is ignored when the mClock scheduler is used.
 default: 5
 flags:
 - runtime
 - name: osd_delete_sleep_ssd
 type: float
 level: advanced
 desc: Time in seconds to sleep before next removal transaction for SSDs
+note: This setting is ignored when the mClock scheduler is used.
 default: 1
 flags:
 - runtime
@@ -1484,6 +1505,7 @@ options:
 level: advanced
 desc: Time in seconds to sleep before next removal transaction when OSD data is on HDD
 and OSD journal or WAL+DB is on SSD
+note: This setting is ignored when the mClock scheduler is used.
 default: 1
 flags:
 - runtime
