Commit 870ad4d

Merge pull request ceph#63848 from zdover23/wip-doc-2025-06-10-backport-63836-to-tentacle
tentacle: doc/rados/operations: Address suggestions for stretch-mode.rst

Reviewed-by: Anthony D'Atri <[email protected]>
2 parents 6ec061f + 4994dc6 commit 870ad4d

File tree

1 file changed: +32 −33 lines


doc/rados/operations/stretch-mode.rst

Lines changed: 32 additions & 33 deletions
@@ -17,8 +17,9 @@ one-third to one-half of the total cluster).
 
 Ceph is designed with the expectation that all parts of its network and cluster
 will be reliable and that failures will be distributed randomly across the
-CRUSH map. Even if a switch goes down and causes the loss of many OSDs, Ceph is
-designed so that the remaining OSDs and monitors will route around such a loss.
+CRUSH topology. When a host or network switch goes down, many OSDs will
+become unavailable. Ceph is designed so that the remaining OSDs and
+Monitors will maintain access to data.
 
 Sometimes this cannot be relied upon. If you have a "stretched-cluster"
 deployment in which much of your cluster is behind a single network component,
@@ -28,13 +29,13 @@ We will here consider two standard configurations: a configuration with two
 data centers (or, in clouds, two availability zones), and a configuration with
 three data centers (or, in clouds, three availability zones).
 
-In the two-site configuration, Ceph expects each of the sites to hold a copy of
-the data, and Ceph also expects there to be a third site that has a tiebreaker
-monitor. This tiebreaker monitor picks a winner if the network connection fails
-and both data centers remain alive.
+In the two-site configuration, Ceph arranges for each site to hold a copy of
+the data. A third site houses a tiebreaker (arbiter, witness)
+Monitor. This tiebreaker Monitor picks a winner when a network connection
+between sites fails and both data centers remain alive.
 
-The tiebreaker monitor can be a VM. It can also have high latency relative to
-the two main sites.
+The tiebreaker monitor can be a VM. It can also have higher network latency
+to the OSD site(s) than OSD site(s) can have to each other.
 
 The standard Ceph configuration is able to survive MANY network failures or
 data-center failures without ever compromising data availability. If enough
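
The two-site-plus-tiebreaker layout described in this hunk is usually expressed by giving each Monitor a CRUSH location. A minimal sketch with ``ceph mon set_location`` follows; the Monitor names (a through e) and the ``site1``/``site2``/``site3`` bucket names are hypothetical placeholders, not part of this commit.

    # Hypothetical layout: two OSD sites plus a tiebreaker at a third location.
    ceph mon set_location a datacenter=site1
    ceph mon set_location b datacenter=site1
    ceph mon set_location c datacenter=site2
    ceph mon set_location d datacenter=site2
    ceph mon set_location e datacenter=site3   # tiebreaker; may be a VM with higher latency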
@@ -56,8 +57,8 @@ without operator intervention.
 
 Ceph does not permit the compromise of data integrity or data consistency, but
 there are situations in which *data availability* is compromised. These
-situations can occur even though there are enough clusters available to satisfy
-Ceph's consistency and sizing constraints. In some situations, you might
+situations can occur even though there are sufficient replicas of data available to satisfy
+consistency and sizing constraints. In some situations, you might
 discover that your cluster does not satisfy those constraints.
 
 The first category of these failures that we will discuss involves inconsistent
@@ -83,10 +84,9 @@ situation is surprisingly difficult to avoid using only standard CRUSH rules.
 
 Individual Stretch Pools
 ========================
-Setting individual ``stretch pool`` is an option that allows for the configuration
-of specific pools to be distributed across ``two or more data centers``.
-This is achieved by executing the ``ceph osd pool stretch set`` command on each desired pool,
-as opposed to applying a cluster-wide configuration ``with stretch mode``.
+Setting individual ``stretch pool`` attributes allows for
+specific pools to be distributed across two or more data centers.
+This is done by executing the ``ceph osd pool stretch set`` command on each desired pool.
 See :ref:`setting_values_for_a_stretch_pool`
 
 Use ``stretch mode`` when you have exactly ``two data centers`` and require a uniform
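
To illustrate the per-pool approach referenced in this hunk, a ``ceph osd pool stretch set`` call might look like the sketch below. The pool name, rule name, and numeric values are assumptions chosen for illustration; the authoritative argument list is the one documented under :ref:`setting_values_for_a_stretch_pool`.

    # Hypothetical example: stretch the pool "app_pool" across datacenters.
    # Assumed argument order: pool, peering_crush_bucket_count,
    # peering_crush_bucket_target, peering_crush_bucket_barrier,
    # crush_rule, size, min_size.
    ceph osd pool stretch set app_pool 2 3 datacenter stretch_rule 6 3
    ceph osd pool stretch show app_pool   # review the values that were applied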
@@ -181,8 +181,8 @@ your CRUSH map. This procedure shows how to do this.
              step emit
      }
 
-   .. warning:: If a CRUSH rule is defined for a stretch mode cluster and the
-      rule has multiple "takes" in it, then ``MAX AVAIL`` for the pools
+   .. warning:: When a CRUSH rule is defined in a stretch mode cluster and the
+      rule has multiple ``take`` steps, ``MAX AVAIL`` for the pools
       associated with the CRUSH rule will report that the available size is all
       of the available space from the datacenter, not the available space for
       the pools associated with the CRUSH rule.
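
For context on the warning above, a CRUSH rule with multiple ``take`` steps (the shape that produces the misleading ``MAX AVAIL`` figure) generally looks like the sketch below. The rule name, id, and the ``site1``/``site2`` bucket names are hypothetical.

    rule stretch_rule_two_takes {
            id 2
            type replicated
            # first take: two copies on hosts under the hypothetical bucket "site1"
            step take site1
            step chooseleaf firstn 2 type host
            step emit
            # second take: two copies on hosts under the hypothetical bucket "site2"
            step take site2
            step chooseleaf firstn 2 type host
            step emit
    }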
@@ -258,12 +258,12 @@ your CRUSH map. This procedure shows how to do this.
       ceph mon enable_stretch_mode e stretch_rule datacenter
 
 When stretch mode is enabled, PGs will become active only when they peer
-across data centers (or across whichever CRUSH bucket type was specified),
-assuming both are alive. Pools will increase in size from the default ``3`` to
-``4``, and two copies will be expected in each site. OSDs will be allowed to
-connect to monitors only if they are in the same data center as the monitors.
-New monitors will not be allowed to join the cluster if they do not specify a
-location.
+across CRUSH ``datacenter``s (or across whichever CRUSH bucket type was specified),
+assuming both are available. Pools will increase in size from the default ``3`` to
+``4``, and two replicas will be placed at each site. OSDs will be allowed to
+connect to Monitors only if they are in the same data center as the Monitors.
+New Monitors will not be allowed to join the cluster if they do not specify a
+CRUSH location.
 
 If all OSDs and monitors in one of the data centers become inaccessible at once,
 the surviving data center enters a "degraded stretch mode". A warning will be
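
A quick way to observe the effects described in this hunk (pool ``size`` rising from 3 to 4, and the stretch-related entries in the monmap) is sketched below. The pool name is a placeholder and the exact monmap field names should be treated as assumptions.

    ceph osd pool get app_pool size       # expected to report size: 4 once stretch mode is enabled
    ceph osd pool get app_pool min_size
    ceph mon dump                         # monmap is expected to list stretch-mode and tiebreaker details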
@@ -297,22 +297,21 @@ To exit stretch mode, run the following command:
 
 .. describe:: {crush_rule}
 
-   The CRUSH rule that the user wants all pools to move back to. If this
-   is not specified, the pools will move back to the default CRUSH rule.
+   The non-stretch CRUSH rule to use for all pools. If this
+   is not specified, the pools will move to the default CRUSH rule.
 
    :Type: String
    :Required: No.
 
-The command will move the cluster back to normal mode,
-and the cluster will no longer be in stretch mode.
-All pools will move its ``size`` and ``min_size``
-back to the default values it started with.
-At this point the user is responsible for scaling down the cluster
-to the desired number of OSDs if they choose to operate with less number of OSDs.
+This command moves the cluster back to normal mode;
+the cluster will no longer be in stretch mode.
+All pools will be set with their prior ``size`` and ``min_size``
+values. At this point the user is responsible for scaling down the cluster
+to the desired number of OSDs if they choose to operate with fewer OSDs.
 
-Please note that the command will not execute when the cluster is in
-``recovery stretch mode``. The command will only execute when the cluster
-is in ``degraded stretch mode`` or ``healthy stretch mode``.
+Note that the command will not execute when the cluster is in
+recovery stretch mode. The command executes only when the cluster
+is in degraded stretch mode or healthy stretch mode.
 
 Limitations of Stretch Mode
 ===========================
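
To make the exit procedure discussed in the preceding hunk concrete: in recent Ceph releases the exit command is ``ceph mon disable_stretch_mode``, which accepts the optional ``{crush_rule}`` argument described above. Treat the confirmation flag and the rule name in this sketch as assumptions rather than part of this commit.

    # Hypothetical invocation; runs only in degraded or healthy stretch mode.
    ceph mon disable_stretch_mode replicated_rule --yes-i-really-mean-it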
