
Commit 3040a6d

Merge pull request ceph#62367 from anthonyeleven/pg-autoscale-upgrade
doc/cephadm: Add PG autoscaler advice to upgrade.rst
Reviewed-by: Zac Dover <[email protected]>
2 parents d3916b5 + ee69f52 commit 3040a6d


1 file changed: +32 -10 lines changed


doc/cephadm/upgrade.rst

Lines changed: 32 additions & 10 deletions
@@ -19,36 +19,56 @@ The automated upgrade process follows Ceph best practices. For example:
 
 .. note::
 
-   In case a host of the cluster is offline, the upgrade is paused.
+   If a cluster host is or becomes unavailable, the upgrade will be paused
+   until it is restored.
+
+.. note::
+
+   When the PG autoscaler mode for **any** pool is set to ``on``, we recommend
+   disabling the autoscaler for the duration of the upgrade. This is so that
+   PG splitting or merging in the middle of an upgrade does not unduly delay
+   upgrade progress. In a very large cluster this could easily increase the
+   time to complete by a whole day or more, especially if the upgrade happens to
+   change PG autoscaler behavior by e.g. changing the default value of
+   :confval:`mon_target_pg_per_osd`.
+   |
+   * ``ceph osd pool set noautoscale``
+   * Perform the upgrade
+   * ``ceph osd pool unset noautoscale``
+   |
+   When pausing autoscaler activity in this fashion, the existing value of
+   each pool's mode (``off``, ``on``, or ``warn``) is expected to remain.
+   If the new release changes the above target value, there may be splitting
+   or merging of PGs when unsetting after the upgrade.
 
 
 Starting the upgrade
 ====================
 
 .. note::
 
-   `Staggered Upgrade`_ of the mons/mgrs may be necessary to have access
-   to this new feature.
+   `Staggered Upgrade`_ of the Monitors and Managers may be necessary to use
+   the below CephFS upgrade feature.
 
-   Cephadm by default reduces `max_mds` to `1`. This can be disruptive for large
+   Cephadm by default reduces ``max_mds`` to ``1``. This can be disruptive for large
    scale CephFS deployments because the cluster cannot quickly reduce active MDS(s)
    to `1` and a single active MDS cannot easily handle the load of all clients
-   even for a short time. Therefore, to upgrade MDS(s) without reducing `max_mds`,
-   the `fail_fs` option can to be set to `true` (default value is `false`) prior
+   even for a short time. Therefore, to upgrade MDS(s) without reducing ``max_mds``,
+   the ``fail_fs`` option can be set to ``true`` (default value is ``false``) prior
    to initiating the upgrade:
 
    .. prompt:: bash #
 
-      ceph config set mgr mgr/orchestrator/fail_fs true
+      ceph config set mgr mgr/orchestrator/fail_fs true
 
    This would:
    #. Fail CephFS filesystems, bringing active MDS daemon(s) to
-      `up:standby` state.
+      ``up:standby`` state.
 
    #. Upgrade MDS daemons safely.
 
    #. Bring CephFS filesystems back up, bringing the state of active
-      MDS daemon(s) from `up:standby` to `up:active`.
+      MDS daemon(s) from ``up:standby`` to ``up:active``.
 
 Before you use cephadm to upgrade Ceph, verify that all hosts are currently online and that your cluster is healthy by running the following command:
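As a concrete illustration of the autoscaler pause recommended in the hunk above, the following shell sketch wraps an upgrade with the ``noautoscale`` flag. The ``ceph osd pool``, ``ceph osd pool autoscale-status``, and ``ceph orch upgrade`` commands are standard; the target version ``19.2.2`` is only a placeholder for whatever release you are actually upgrading to.

   # Record current per-pool autoscaler modes so they can be verified afterward
   ceph osd pool autoscale-status

   # Set the global noautoscale flag; each pool's mode (off/on/warn) is preserved
   ceph osd pool set noautoscale

   # Run the upgrade (19.2.2 is a placeholder version) and monitor its progress
   ceph orch upgrade start --ceph-version 19.2.2
   ceph orch upgrade status

   # Once the upgrade completes, re-enable autoscaler activity
   ceph osd pool unset noautoscale

Pausing via the global flag, rather than changing each pool's ``pg_autoscale_mode``, is what lets the per-pool settings survive the upgrade unchanged.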

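Similarly, a minimal sketch of the ``fail_fs`` path described in the same hunk, assuming a cluster that serves CephFS; the ``ceph config get`` confirmation step and the placeholder version are illustrative additions, not part of the committed text.

   # Allow MDS daemons to be upgraded without reducing max_mds
   ceph config set mgr mgr/orchestrator/fail_fs true

   # Confirm the option took effect before starting the upgrade
   ceph config get mgr mgr/orchestrator/fail_fs

   # Start the upgrade as usual (placeholder version)
   ceph orch upgrade start --ceph-version 19.2.2
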
@@ -145,7 +165,9 @@ The message ``Error ENOENT: Module not found`` appears in response to the command
 
    Error ENOENT: Module not found
 
-This is possibly caused by invalid JSON in a mgr config-key. See `Redmine tracker Issue #67329 <https://tracker.ceph.com/issues/67329>`_ and `the discussion on the [ceph-users] mailing list <https://www.spinics.net/lists/ceph-users/msg83667.html>`_.
+This is possibly caused by invalid JSON in a mgr config-key.
+See `Redmine tracker Issue #67329 <https://tracker.ceph.com/issues/67329>`_
+and `this discussion on the ceph-users mailing list <https://www.spinics.net/lists/ceph-users/msg83667.html>`_.
 
 UPGRADE_NO_STANDBY_MGR
 ----------------------
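For the invalid-JSON cause noted in the hunk above, one hedged way to narrow down the offending key is to dump candidate mgr config-keys and run them through a JSON parser. The key name ``mgr/cephadm/upgrade_state`` below is only an illustrative example, not the key identified in the tracker issue; inspect whichever keys your cluster actually stores.

   # List mgr-related config-keys to find candidates for inspection
   ceph config-key ls | grep mgr/

   # Check whether a candidate key parses as valid JSON
   # (mgr/cephadm/upgrade_state is an example key, not necessarily the culprit)
   ceph config-key get mgr/cephadm/upgrade_state | python3 -m json.tool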
