Commit e93e155

Merge pull request ceph#61254 from kamoltat/wip-ksirivad-fix-stretch-mode-doc
doc/rados/operations/stretch-mode: Improve doc

Reviewed-by: zdover23
Reviewed-by: anthonyeleven
2 parents 7c9338d + 8cc7fdb commit e93e155

1 file changed

+23
-7
lines changed

doc/rados/operations/stretch-mode.rst

Lines changed: 23 additions & 7 deletions
@@ -119,13 +119,29 @@ See https://tracker.ceph.com/issues/68338 for more information.
 
 Stretch Mode
 ============
-Stretch mode is designed to handle deployments in which you cannot guarantee the
-replication of data across two data centers. This kind of situation can arise
-when the cluster's CRUSH rule specifies that three copies are to be made, but
-then a copy is placed in each data center with a ``min_size`` of 2. Under such
-conditions, a placement group can become active with two copies in the first
-data center and no copies in the second data center.
 
+Stretch mode is designed to handle netsplit scenarios between two data zones as well
+as the loss of one data zone. It handles the netsplit scenario by choosing the surviving zone
+that has the better connection to the ``tiebreaker monitor``. It handles the loss of one zone by
+reducing the ``size`` to ``2`` and ``min_size`` to ``1``, allowing the cluster to continue operating
+with the remaining zone. When the lost zone comes back, the cluster will recover the lost data
+and return to normal operation.
+
+Connectivity Monitor Election Strategy
+---------------------------------------
+When using stretch mode, the monitor election strategy must be set to ``connectivity``.
+This strategy tracks network connectivity between the monitors and is
+used to determine which zone should be favored when the cluster is in a netsplit scenario.
+
+See `Changing Monitor Elections`_
+
+Stretch Peering Rule
+--------------------
+One critical behavior of stretch mode is its ability to prevent a PG from going active if the acting set
+contains only replicas from a single zone. This safeguard is crucial for mitigating the risk of data
+loss during site failures because if a PG were allowed to go active with replicas only in a single site,
+writes could be acknowledged despite a lack of redundancy. In the event of a site failure, all data in the
+affected PG would be lost.
 
 Entering Stretch Mode
 ---------------------
@@ -271,7 +287,7 @@ possible, if needed).
 .. _Changing Monitor elections: ../change-mon-elections
 
 Exiting Stretch Mode
-=====================
+--------------------
 To exit stretch mode, run the following command:
 
 .. prompt:: bash $
