
Commit 64d61b7

Merge pull request ceph#58109 from zdover23/wip-doc-2024-06-18-rados-ops-stretch-mode
doc/rados: add stretch_rule workaround

Reviewed-by: Prashant D
2 parents 0bda735 + 007385a commit 64d61b7

1 file changed: +51 -0 lines changed
doc/rados/operations/stretch-mode.rst

Lines changed: 51 additions & 0 deletions
@@ -130,6 +130,57 @@ your CRUSH map. This procedure shows how to do this.
            step emit
        }

+   .. warning:: If a CRUSH rule is defined for a stretch mode cluster and the
+      rule has multiple "takes" in it, then ``MAX AVAIL`` for the pools
+      associated with the CRUSH rule will report that the available size is all
+      of the available space from the datacenter, not the available space for
+      the pools associated with the CRUSH rule.
+
+      For example, consider a cluster with two CRUSH rules, ``stretch_rule`` and
+      ``stretch_replicated_rule``::
+
+         rule stretch_rule {
+             id 1
+             type replicated
+             step take DC1
+             step chooseleaf firstn 2 type host
+             step emit
+             step take DC2
+             step chooseleaf firstn 2 type host
+             step emit
+         }
+
+         rule stretch_replicated_rule {
+             id 2
+             type replicated
+             step take default
+             step choose firstn 0 type datacenter
+             step chooseleaf firstn 2 type host
+             step emit
+         }
+
+      In the above example, ``stretch_rule`` will report an incorrect value for
+      ``MAX AVAIL``. ``stretch_replicated_rule`` will report the correct value.
+      This is because ``stretch_rule`` is defined in such a way that
+      ``PGMap::get_rule_avail`` considers only the available size of a single
+      data center, and not (as would be correct) the total available size from
+      both datacenters.
+
+      Here is a workaround. Instead of defining the stretch rule as defined in
+      the ``stretch_rule`` function above, define it as follows::
+
+         rule stretch_rule {
+             id 2
+             type replicated
+             step take default
+             step choose firstn 0 type datacenter
+             step chooseleaf firstn 2 type host
+             step emit
+         }
+
+      See https://tracker.ceph.com/issues/56650 for more detail on this workaround.
+
+
 #. Inject the CRUSH map to make the rule available to the cluster:

    .. prompt:: bash $
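
The warning added in this change applies to rules that contain more than one ``step take``. As a rough illustration only, the sketch below scans decompiled CRUSH map text in the form shown in the diff and reports which rules have multiple take steps. It is not part of the commit or of Ceph: the function names ``count_take_steps`` and ``multi_take_rules`` are hypothetical, and the parser assumes each rule opens with ``rule <name> {`` and closes with a ``}`` on its own line.

```python
import re

# Hypothetical helper (not a Ceph API): count "step take" lines per rule
# in decompiled CRUSH map text of the form shown in the diff above.
def count_take_steps(crush_text):
    counts = {}
    current = None  # name of the rule block being read, if any
    for raw in crush_text.splitlines():
        line = raw.strip()
        opened = re.match(r"rule\s+(\S+)\s*\{", line)
        if opened:
            current = opened.group(1)
            counts[current] = 0
        elif line == "}":
            current = None  # end of a block (rule or otherwise)
        elif current is not None and re.match(r"step\s+take\b", line):
            counts[current] += 1
    return counts

# Hypothetical helper: names of rules that the warning would apply to.
def multi_take_rules(crush_text):
    return sorted(name for name, n in count_take_steps(crush_text).items() if n > 1)

# The two rules quoted in the warning text.
EXAMPLE = """
rule stretch_rule {
    id 1
    type replicated
    step take DC1
    step chooseleaf firstn 2 type host
    step emit
    step take DC2
    step chooseleaf firstn 2 type host
    step emit
}

rule stretch_replicated_rule {
    id 2
    type replicated
    step take default
    step choose firstn 0 type datacenter
    step chooseleaf firstn 2 type host
    step emit
}
"""

if __name__ == "__main__":
    print(multi_take_rules(EXAMPLE))  # prints ['stretch_rule']
```

A check like this only flags candidate rules by their text. Whether ``MAX AVAIL`` is actually misreported for a given pool still has to be confirmed on the cluster itself.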
