
Commit b0812b5

Merge pull request ceph#64809 from bill-scales/ec_docs

Doc: Erasure Coding enhancements for tentacle

2 parents 563ac16 + 185987a

5 files changed: +74 -1 lines changed

doc/cephfs/createfs.rst

Lines changed: 6 additions & 0 deletions

@@ -137,5 +137,11 @@ You may use Erasure Coded pools as CephFS data pools as long as they have overwr
 Note that EC overwrites are only supported when using OSDs with the BlueStore backend.
 
+If you are storing many small files, or frequently modify files, you can
+improve performance by enabling EC optimizations, which is done as follows:
+
+.. code:: bash
+
+   ceph osd pool set my_ec_pool allow_ec_optimizations true
 
 You may not use Erasure Coded pools as CephFS metadata pools, because CephFS metadata is stored using RADOS *OMAP* data structures, which EC pools cannot store.

doc/rados/configuration/pool-pg-config-ref.rst

Lines changed: 1 addition & 0 deletions

@@ -69,6 +69,7 @@ See :ref:`pg-autoscaler`.
 .. confval:: osd_max_pg_log_entries
 .. confval:: osd_default_data_pool_replay_window
 .. confval:: osd_max_pg_per_osd_hard_ratio
+.. confval:: osd_pool_default_flag_ec_optimizations
 
 .. _pool: ../../operations/pools
 .. _Monitoring OSDs and PGs: ../../operations/monitoring-osd-pg#peering

doc/rados/operations/erasure-code-profile.rst

Lines changed: 3 additions & 1 deletion

@@ -80,7 +80,9 @@ Where:
           ``osd_pool_erasure_code_stripe_unit`` when a pool is
           created. The stripe_width of a pool using this profile
           will be the number of data chunks multiplied by this
-          stripe_unit.
+          stripe_unit. See `Erasure Coding Optimizations`_ for
+          more information.
 
 :Type: String
 :Required: No.

doc/rados/operations/erasure-code.rst

Lines changed: 56 additions & 0 deletions

@@ -206,6 +206,62 @@ erasure-coded pool as the ``--data-pool`` during image creation:
 For CephFS, an erasure-coded pool can be set as the default data pool during
 file system creation or via :ref:`file-layouts`.
 
+Erasure Coding Optimizations
+----------------------------
+
+Since Tentacle, an erasure-coded pool may have optimizations enabled with a
+per-pool setting. This improves performance for smaller I/Os and eliminates
+padding, which can save capacity:
+
+.. prompt:: bash $
+
+   ceph osd pool set ec_pool allow_ec_optimizations true
+
+These optimizations make an erasure-coded pool more suitable for use with
+RBD or CephFS. RGW workloads with large objects that are read and written
+sequentially will see little benefit from these optimizations, but RGW
+workloads with many very small objects, or with small random-access reads,
+will see performance and capacity benefits.
+
+This flag may be enabled for existing pools, and can be made the default for
+new pools using the central configuration option
+:confval:`osd_pool_default_flag_ec_optimizations`. Once the flag has been
+enabled for a pool it cannot be disabled, because it changes how new data is
+stored.
+
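To make optimizations the default for all new pools, the option named above can be set in the central configuration. A minimal sketch of that workflow (the exact commands are an illustration of the generic `ceph config` interface, not taken from this diff):

```shell
# Make new pools default to EC optimizations. This only takes effect once
# all Monitors and OSDs are on Tentacle or later; existing pools still need
# the per-pool allow_ec_optimizations flag set explicitly.
ceph config set global osd_pool_default_flag_ec_optimizations true

# Confirm the value the monitors will hand out:
ceph config get mon osd_pool_default_flag_ec_optimizations
```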
+The flag cannot be set until all Monitors and OSDs have been upgraded to
+Tentacle or later. Optimizations can be enabled and used without upgrading
+gateways and clients.
+
+Optimizations are currently supported only with the Jerasure and ISA-L
+plugins when using the ``reed_sol_van`` technique (these are the old and
+current defaults, and are the most widely used plugins and technique).
+Attempting to set the flag for a pool that uses an unsupported combination
+of plugin and technique is blocked with an error message.
+
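Putting the supported-plugin rule into practice, a sketch of creating a pool on which the flag can be set (the profile and pool names here are illustrative):

```shell
# Create a profile with a supported plugin/technique combination
# (isa + reed_sol_van), create a pool from it, then enable optimizations.
ceph osd erasure-code-profile set ec_opt_profile \
    plugin=isa technique=reed_sol_van k=4 m=2
ceph osd pool create ec_pool erasure ec_opt_profile
ceph osd pool set ec_pool allow_ec_optimizations true
```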
+The default stripe unit is 4K, which works well for standard EC pools. For
+the majority of I/O workloads it is recommended to increase the stripe unit
+to at least 16K when using optimizations. Performance testing shows that 16K
+is the best choice for general-purpose I/O workloads. Increasing this value
+will significantly improve small-read performance but will slightly reduce
+the performance of small sequential writes. For I/O workloads that are
+predominantly reads, larger values up to 256K will further improve read
+performance but will further reduce the performance of small sequential
+writes. Values larger than 256K are unlikely to have any performance
+benefit. The stripe unit is a pool create-time option that can be set in the
+erasure code profile or via the central configuration option
+:confval:`osd_pool_erasure_code_stripe_unit`. The stripe unit cannot be
+changed after the pool has been created, so if you enable optimizations for
+an existing pool you will not get their full benefit.
+
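Since the stripe unit must be chosen at pool-creation time, the 16K recommendation is most easily applied through the erasure code profile. A sketch with illustrative names:

```shell
# Hypothetical profile using the recommended 16K stripe unit. The pool's
# stripe_width will be the number of data chunks multiplied by the stripe
# unit (here 4 * 16K = 64K).
ceph osd erasure-code-profile set ec_16k_profile \
    k=4 m=2 stripe_unit=16K
ceph osd pool create ec_pool erasure ec_16k_profile
ceph osd pool set ec_pool allow_ec_optimizations true
```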
+Without optimizations enabled, the choice of ``k+m`` in the erasure code
+profile affects performance: the higher the values of ``k`` and ``m``, the
+lower the performance. With optimizations enabled there is only a very
+slight reduction in performance as ``k`` increases, which makes higher
+values of ``k`` more viable. Increasing ``m`` still impacts write
+performance, especially for small writes, so for block and file workloads a
+value of ``m`` no larger than 3 is recommended.
+
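Combining the guidance above, a wider profile becomes reasonable once optimizations are enabled; this sketch (profile name illustrative) uses a higher ``k`` while keeping ``m`` at the recommended ceiling of 3 for block and file workloads:

```shell
# A wider profile made viable by EC optimizations: k=8 for lower capacity
# overhead, m capped at 3 to limit the impact on small-write performance.
ceph osd erasure-code-profile set ec_wide_profile \
    k=8 m=3 stripe_unit=16K
```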
 Erasure-coded pool overhead
 ---------------------------

doc/rados/operations/pools.rst

Lines changed: 8 additions & 0 deletions

@@ -451,7 +451,14 @@ You may set values for the following keys:
 :Type: Boolean
 
 .. versionadded:: 12.2.0
+
+.. describe:: allow_ec_optimizations
 
+:Description: Enables performance and capacity optimizations for an erasure-coded pool. These optimizations were designed for CephFS and RBD workloads; RGW workloads with significant numbers of small objects, or with small random-access reads of objects, will also benefit. RGW workloads with large sequential reads and writes will see little benefit. For more details, see `Erasure Coding Optimizations`_.
+:Type: Boolean
+
+.. versionadded:: 20.2.0
+
 .. describe:: hashpspool
 
 :Description: Sets or unsets the ``HASHPSPOOL`` flag on a given pool.

@@ -900,6 +907,7 @@ Here are the break downs of the argument:
 
 .. _Bloom Filter: https://en.wikipedia.org/wiki/Bloom_filter
 .. _Erasure Coding with Overwrites: ../erasure-code#erasure-coding-with-overwrites
+.. _Erasure Coding Optimizations: ../erasure-code#erasure-coding-optimizations
 .. _Block Device Commands: ../../../rbd/rados-rbd-cmds/#create-a-block-device-pool
 .. _pgcalc: ../pgcalc
