Commit 82187d2

naota authored and kdave committed
btrfs: zoned: fix chunk allocation condition for zoned allocator
The ZNS specification defines a limit on the number of "active" zones. That limit forces us to limit the number of block groups that can be used for allocation at the same time. To avoid exceeding the limit, commit a85f05e ("btrfs: zoned: avoid chunk allocation if active block group has enough space") made us reuse the existing active block groups as much as possible when we can't activate any other zone without sacrificing an already activated block group.

However, the check is wrong in two ways. First, it checks the condition for every raid index (ffe_ctl->index). Even if it reaches the condition and "ffe_ctl->max_extent_size >= ffe_ctl->min_alloc_size" is met, there can be other block groups with enough space to hold ffe_ctl->num_bytes. (Actually, this won't happen in the current zoned code as it only supports the SINGLE profile. But it can happen once other RAID types are enabled.)

Second, it checks the active zone availability depending on the raid index. The raid index is just an index into space_info->block_groups, so it has nothing to do with chunk allocation.

These mistakes cause a faulty allocation in a certain situation. Consider running zoned btrfs on a device whose max_active_zone == 0 (no limit). Suppose no block group has room to fit ffe_ctl->num_bytes, but some have room to meet ffe_ctl->min_alloc_size (i.e. num_bytes > max_extent_size >= min_alloc_size).

In this situation, the following occurs:

- With the SINGLE raid_index, it reaches the chunk allocation checking code
- The check returns true because we can activate a new zone (no limit)
- But, before allocating the chunk, it iterates to the next raid index (RAID5)
- Since there are no RAID5 block groups in zoned mode, it again reaches the check code
- The check returns false because of btrfs_can_activate_zone()'s "if (raid_index != BTRFS_RAID_SINGLE)" part
- That results in returning -ENOSPC without allocating a new chunk

As a result, we end up hitting -ENOSPC too early.

Move the check to the right place in the can_allocate_chunk() hook, and do the active zone check depending on the allocation flag, not on the raid index.

CC: [email protected] # 5.16
Signed-off-by: Naohiro Aota <[email protected]>
Signed-off-by: David Sterba <[email protected]>

1 parent 50475cd commit 82187d2
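
The failure sequence in the commit message can be replayed with a small user-space model. This is only a sketch under stated assumptions: `model_ctl`, `model_search_old()` and `model_search_new()` are hypothetical names, the RAID index enum is reduced to two entries, and the search at every index is assumed to fail to fit `num_bytes`. It is not the kernel code, only an illustration of why keying the check on the raid index returns -ENOSPC before chunk allocation is reached.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the kernel's RAID index enum. */
enum model_raid_index { MODEL_RAID_SINGLE, MODEL_RAID_RAID5, MODEL_NR_RAID_TYPES };

/* Hypothetical model of the state the check looks at. */
struct model_ctl {
	uint64_t min_alloc_size;
	uint64_t max_extent_size;	/* largest free extent seen so far */
	unsigned int max_active_zones;	/* 0 means "no limit" */
	unsigned int active_zones_left;
};

/* True when the modelled device can still activate a zone. */
static bool model_device_has_room(const struct model_ctl *c)
{
	return c->max_active_zones == 0 || c->active_zones_left > 0;
}

/* Old behaviour: activation check keyed on the RAID index, run per index. */
static int model_search_old(const struct model_ctl *c)
{
	assert(c);
	for (int index = 0; index < MODEL_NR_RAID_TYPES; index++) {
		/* Assumption: no block group at this index can hold num_bytes. */
		bool can_activate = index == MODEL_RAID_SINGLE &&
				    model_device_has_room(c);

		if (c->max_extent_size >= c->min_alloc_size && !can_activate)
			return -ENOSPC;
	}
	return 0;	/* reached chunk allocation */
}

/* New behaviour: one check in the chunk allocation hook, keyed on flags. */
static int model_search_new(const struct model_ctl *c)
{
	assert(c);
	/* Assumption: every RAID index was tried first and all of them failed. */
	if (c->max_extent_size >= c->min_alloc_size &&
	    !model_device_has_room(c))
		return -ENOSPC;
	return 0;	/* reached chunk allocation */
}
```

With `max_active_zones == 0`, the old model fails at the RAID5 index even though a zone could be activated, while the new model proceeds to chunk allocation. When the active zone limit is exhausted, both models return -ENOSPC, which is the intended "retry with a smaller size" path.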

3 files changed: 13 additions and 18 deletions

fs/btrfs/extent-tree.c

Lines changed: 9 additions & 12 deletions
@@ -3981,6 +3981,15 @@ static bool can_allocate_chunk(struct btrfs_fs_info *fs_info,
 	case BTRFS_EXTENT_ALLOC_CLUSTERED:
 		return true;
 	case BTRFS_EXTENT_ALLOC_ZONED:
+		/*
+		 * If we have enough free space left in an already
+		 * active block group and we can't activate any other
+		 * zone now, do not allow allocating a new chunk and
+		 * let find_free_extent() retry with a smaller size.
+		 */
+		if (ffe_ctl->max_extent_size >= ffe_ctl->min_alloc_size &&
+		    !btrfs_can_activate_zone(fs_info->fs_devices, ffe_ctl->flags))
+			return false;
 		return true;
 	default:
 		BUG();
@@ -4027,18 +4036,6 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 		return 0;
 	}
 
-	if (ffe_ctl->max_extent_size >= ffe_ctl->min_alloc_size &&
-	    !btrfs_can_activate_zone(fs_info->fs_devices, ffe_ctl->index)) {
-		/*
-		 * If we have enough free space left in an already active block
-		 * group and we can't activate any other zone now, retry the
-		 * active ones with a smaller allocation size. Returning early
-		 * from here will tell btrfs_reserve_extent() to haven the
-		 * size.
-		 */
-		return -ENOSPC;
-	}
-
 	if (ffe_ctl->loop >= LOOP_CACHING_WAIT && ffe_ctl->have_caching_bg)
 		return 1;
 
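
The removed comment relies on the caller retrying with a smaller size after -ENOSPC. The sketch below models that retry loop in user space, assuming the caller halves the request, rounds it down to the sector size, and clamps it to `min_alloc_size` before one final attempt. `model_reserve_with_halving()` and its parameters are hypothetical names; `largest_free_extent` stands in for whatever the real search would find in the active block groups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a "halve and retry" reservation loop.
 * Returns the size that finally fit, or 0 if even min_alloc_size
 * does not fit into the largest free extent.
 */
static uint64_t model_reserve_with_halving(uint64_t num_bytes,
					   uint64_t min_alloc_size,
					   uint64_t sectorsize,
					   uint64_t largest_free_extent)
{
	bool final_tried = (num_bytes == min_alloc_size);

	assert(sectorsize > 0);
	for (;;) {
		/* Stand-in for a successful search in an active block group. */
		if (num_bytes <= largest_free_extent)
			return num_bytes;
		if (final_tried)
			return 0;

		/* Halve, round down to the sector size, clamp to the minimum. */
		num_bytes >>= 1;
		num_bytes -= num_bytes % sectorsize;
		if (num_bytes < min_alloc_size)
			num_bytes = min_alloc_size;
		if (num_bytes == min_alloc_size)
			final_tried = true;
	}
}
```

For an 8 MiB request against a 1 MiB free extent, this model succeeds on the fourth attempt with a 1 MiB allocation, which is the outcome the check is meant to allow when no further zone can be activated.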

fs/btrfs/zoned.c

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1925,7 +1925,7 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
19251925
return ret;
19261926
}
19271927

1928-
bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, int raid_index)
1928+
bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
19291929
{
19301930
struct btrfs_device *device;
19311931
bool ret = false;
@@ -1934,8 +1934,7 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, int raid_index
 		return true;
 
 	/* Non-single profiles are not supported yet */
-	if (raid_index != BTRFS_RAID_SINGLE)
-		return false;
+	ASSERT((flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0);
 
 	/* Check if there is a device with active zones left */
 	mutex_lock(&fs_devices->device_list_mutex);

fs/btrfs/zoned.h

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -73,8 +73,7 @@ struct btrfs_device *btrfs_zoned_get_device(struct btrfs_fs_info *fs_info,
7373
u64 logical, u64 length);
7474
bool btrfs_zone_activate(struct btrfs_block_group *block_group);
7575
int btrfs_zone_finish(struct btrfs_block_group *block_group);
76-
bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices,
77-
int raid_index);
76+
bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags);
7877
void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical,
7978
u64 length);
8079
void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
@@ -226,7 +225,7 @@ static inline int btrfs_zone_finish(struct btrfs_block_group *block_group)
 }
 
 static inline bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices,
-					   int raid_index)
+					   u64 flags)
 {
 	return true;
 }
