Commit 43369eb

Merge pull request ceph#57448 from aclamk/wip-aclamk-bs-recompression-segmented-data
os/bluestore: Recompression, part 3. Segmented onode.
2 parents 09e8bd4 + 7472911 commit 43369eb

10 files changed, +252 -57 lines changed
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+overrides:
+  ceph:
+    conf:
+      osd:
+        bluestore onode segment size: 1024K
+
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+overrides:
+  ceph:
+    conf:
+      osd:
+        bluestore onode segment size: 256K
+
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+overrides:
+  ceph:
+    conf:
+      osd:
+        bluestore onode segment size: 512K
+        bluestore debug onode segmentation random: true
+
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+overrides:
+  ceph:
+    conf:
+      osd:
+        bluestore onode segment size: 512K
+
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+overrides:
+  ceph:
+    conf:
+      osd:
+        bluestore onode segment size: 0
+
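
For orientation, the five override fragments above set the segment size to 1024K, 256K, 512K (twice, once with the random debug switch) and 0. The sketch below converts those strings to byte counts. It is only an illustration: `parse_size` is a hypothetical helper, not Ceph's own parser, and it assumes that the `K`, `M` and `G` suffixes denote powers of 1024.

```python
def parse_size(text):
    """Convert a size string such as "256K" to a byte count.

    Assumption: K, M and G are binary multiples (1024, 1024**2, 1024**3).
    A bare number such as "0" is taken as bytes.
    """
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip().upper()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


# Values taken from the five override fragments in this commit.
fragment_values = ["1024K", "256K", "512K", "512K", "0"]
segment_bytes = [parse_size(v) for v in fragment_values]
```

Under that assumption the test matrix covers 1 MiB, 256 KiB and 512 KiB segments, plus the disabled case.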

src/common/options/global.yaml.in

Lines changed: 32 additions & 1 deletion
@@ -5112,7 +5112,7 @@ options:
   type: bool
   level: advanced
   desc: Random selection of write path mode
-  long_desc: For testing purposes. If true, value of bluestore_write_v2 is randomly selected.
+  long_desc: For testing purposes. If true, value of bluestore_write_v2 is randomly selected on each mount.
   default: false
   see_also:
   - bluestore_write_v2
@@ -6697,6 +6697,37 @@ options:
   desc: How long cleaner should sleep before re-checking utilization
   default: 5
   with_legacy: true
+- name: bluestore_onode_segment_size
+  type: size
+  level: advanced
+  desc: Size of segment for onode.
+  long_desc: When object size grows too large BlueStore splits allocation metadata into
+    smaller RocksDB keys (shards). When multiple blobs overlap each other
+    some of them might belong to more than one shard. The encoding for such case
+    is inefficient (spanning blobs). Segmentation of data prevents blobs from crossing
+    specific separation lines, preventing spanning blobs altogether.
+    The smaller values give better split on onode shards.
+    The larger values minimize space loss for padding in compression.
+    Recommended values 256K, 512K, 1024K. Value 0 disables segmentation.
+    Actual segment size cannot be smaller than "compression_max_blob_size" pool option, if set.
+  default: 0
+  see_also:
+  - bluestore_extent_map_shard_max_size
+  - bluestore_extent_map_shard_target_size
+  - bluestore_debug_onode_segmentation_random
+  with_legacy: false
+- name: bluestore_debug_onode_segmentation_random
+  type: bool
+  level: dev
+  desc: Random selection of onode segmentation
+  long_desc: For testing purposes. On each mount 50% roll decides whether to use
+    bluestore_onode_segment_size or set it to 0 (disable).
+  default: false
+  see_also:
+  - bluestore_onode_segment_size
+  flags:
+  - startup
+  with_legacy: false
 - name: jaeger_tracing_enable
   type: bool
   level: advanced
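
The two option descriptions above can be illustrated with a small model. This is a sketch written from the `long_desc` text only, not the BlueStore implementation: every function name is hypothetical, the separation lines are assumed to sit at multiples of the segment size counted from object offset 0, and a value of 0 is assumed to stay 0 regardless of `compression_max_blob_size`.

```python
import random


def mount_segment_size(configured, debug_random, rng=random):
    """Model of bluestore_debug_onode_segmentation_random.

    On each mount a 50% roll either keeps the configured size or sets it to 0.
    """
    if debug_random and rng.random() < 0.5:
        return 0
    return configured


def effective_segment_size(configured, compression_max_blob_size=0):
    """Segment size is never smaller than compression_max_blob_size, if set.

    Assumption: 0 (segmentation disabled) is left untouched.
    """
    if configured == 0:
        return 0
    return max(configured, compression_max_blob_size)


def split_at_segments(offset, length, segment_size):
    """Split [offset, offset + length) so no piece crosses a segment boundary.

    Assumption: boundaries are multiples of segment_size from offset 0.
    Returns a list of (offset, length) pieces; segment_size 0 means no split.
    """
    if length <= 0:
        return []
    if segment_size == 0:
        return [(offset, length)]
    pieces = []
    end = offset + length
    while offset < end:
        boundary = (offset // segment_size + 1) * segment_size
        piece_end = min(end, boundary)
        pieces.append((offset, piece_end - offset))
        offset = piece_end
    return pieces


KIB = 1024
# A 400 KiB write starting at 200 KiB with 256 KiB segments.
pieces = split_at_segments(200 * KIB, 400 * KIB, 256 * KIB)
```

In this model the write is cut into three pieces at the 256 KiB and 512 KiB lines, so a blob built from any single piece cannot straddle a boundary. Smaller segments produce more cut points, larger ones fewer, which mirrors the trade-off between shard splitting and compression padding described in `long_desc`.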
