Commit d5acbc6

Merge tag 'for-6.7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
 "New features:

   - raid-stripe-tree

     New tree for logical file extent mapping where the physical mapping
     may not match on multiple devices. This is now used in zoned mode to
     implement RAID0/RAID1* profiles, but can be used in non-zoned mode
     as well. The support for RAID56 is in development and will
     eventually fix the problems with the current implementation. This
     is a backward incompatible feature and has to be enabled at mkfs
     time.

   - simple quota accounting (squota)

     A simplified mode of qgroup that accounts all space on the initial
     extent owners (a subvolume), the snapshots are then cheap to create
     and delete. The deletion of snapshots in fully accounting qgroups
     is a known CPU/IO performance bottleneck.

     The squota is not suitable for the general use case but works well
     for containers where the original subvolume exists for the whole
     time. This is a backward incompatible feature as it needs extending
     some structures, but can be enabled on an existing filesystem.

   - temporary filesystem fsid (temp_fsid)

     The fsid identifies a filesystem and is hard coded in the
     structures, which disallows mounting the same fsid found on
     different devices.

     For a single device filesystem this is not strictly necessary, a
     new temporary fsid can be generated on mount e.g. after a device is
     cloned. This will be used by Steam Deck for root partition A/B
     testing, or can be used for VM root images.

  Other user visible changes:

   - filesystems with partially finished metadata_uuid conversion cannot
     be mounted anymore and the uuid fixup has to be done by btrfs-progs
     (btrfstune).

  Performance improvements:

   - reduce reservations for checksum deletions (with enabled free space
     tree by factor of 4), on a sample workload on file with many
     extents the deletion time decreased by 12%

   - make extent state merges more efficient during insertions, reduce
     rb-tree iterations (run time of critical functions reduced by 5%)

  Core changes:

   - the integrity check functionality has been removed, this was a
     debugging feature and removal does not affect other integrity
     checks like checksums or tree-checker

   - space reservation changes:

      - more efficient delayed ref reservations, this avoids building up
        too much work or overusing or exhausting the global block
        reserve in some situations

      - move delayed refs reservation to the transaction start time,
        this prevents some ENOSPC corner cases related to exhaustion of
        global reserve

      - improvements in reducing excessive reservations for block group
        items

      - adjust overcommit logic in near full situations, account for one
        more chunk to eventually allocate metadata chunk, this is mostly
        relevant for small filesystems (<10GiB)

   - single device filesystems are scanned but not registered (except
     seed devices), this allows temp_fsid to work

   - qgroup iterations do not need GFP_ATOMIC allocations anymore

   - cleanups, refactoring, reduced data structure size, function
     parameter simplifications, error handling fixes"

* tag 'for-6.7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (156 commits)
  btrfs: open code timespec64 in struct btrfs_inode
  btrfs: remove redundant log root tree index assignment during log sync
  btrfs: remove redundant initialization of variable dirty in btrfs_update_time()
  btrfs: sysfs: show temp_fsid feature
  btrfs: disable the device add feature for temp-fsid
  btrfs: disable the seed feature for temp-fsid
  btrfs: update comment for temp-fsid, fsid, and metadata_uuid
  btrfs: remove pointless empty log context list check when syncing log
  btrfs: update comment for struct btrfs_inode::lock
  btrfs: remove pointless barrier from btrfs_sync_file()
  btrfs: add and use helpers for reading and writing last_trans_committed
  btrfs: add and use helpers for reading and writing fs_info->generation
  btrfs: add and use helpers for reading and writing log_transid
  btrfs: add and use helpers for reading and writing last_log_commit
  btrfs: support cloned-device mount capability
  btrfs: add helper function find_fsid_by_disk
  btrfs: stop reserving excessive space for block group item insertions
  btrfs: stop reserving excessive space for block group item updates
  btrfs: reorder btrfs_inode to fill gaps
  btrfs: open code btrfs_ordered_inode_tree in btrfs_inode
  ...
2 parents 8829687 + c6e8f89 commit d5acbc6

83 files changed: 4033 additions and 5291 deletions

fs/btrfs/Kconfig

Lines changed: 0 additions & 21 deletions
@@ -48,27 +48,6 @@ config BTRFS_FS_POSIX_ACL
 
 	  If you don't know what Access Control Lists are, say N
 
-config BTRFS_FS_CHECK_INTEGRITY
-	bool "Btrfs with integrity check tool compiled in (DEPRECATED)"
-	depends on BTRFS_FS
-	help
-	  This feature has been deprecated and will be removed in 6.7.
-
-	  Adds code that examines all block write requests (including
-	  writes of the super block). The goal is to verify that the
-	  state of the filesystem on disk is always consistent, i.e.,
-	  after a power-loss or kernel panic event the filesystem is
-	  in a consistent state.
-
-	  If the integrity check tool is included and activated in
-	  the mount options, plenty of kernel memory is used, and
-	  plenty of additional CPU cycles are spent. Enabling this
-	  functionality is not intended for normal use.
-
-	  In most cases, unless you are a btrfs developer who needs
-	  to verify the integrity of (super)-block write requests
-	  during the run of a regression test, say N
-
 config BTRFS_FS_RUN_SANITY_TESTS
 	bool "Btrfs will run sanity tests upon loading"
 	depends on BTRFS_FS

fs/btrfs/Makefile

Lines changed: 1 addition & 2 deletions
@@ -33,10 +33,9 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
 	   block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \
 	   subpage.o tree-mod-log.o extent-io-tree.o fs.o messages.o bio.o \
-	   lru_cache.o
+	   lru_cache.o raid-stripe-tree.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
-btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
 btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o
 btrfs-$(CONFIG_BLK_DEV_ZONED) += zoned.o
 btrfs-$(CONFIG_FS_VERITY) += verity.o

fs/btrfs/accessors.h

Lines changed: 16 additions & 0 deletions
@@ -4,6 +4,7 @@
 #define BTRFS_ACCESSORS_H
 
 #include <linux/stddef.h>
+#include <asm/unaligned.h>
 
 struct btrfs_map_token {
 	struct extent_buffer *eb;
@@ -305,6 +306,14 @@ BTRFS_SETGET_FUNCS(timespec_nsec, struct btrfs_timespec, nsec, 32);
 BTRFS_SETGET_STACK_FUNCS(stack_timespec_sec, struct btrfs_timespec, sec, 64);
 BTRFS_SETGET_STACK_FUNCS(stack_timespec_nsec, struct btrfs_timespec, nsec, 32);
 
+BTRFS_SETGET_FUNCS(stripe_extent_encoding, struct btrfs_stripe_extent, encoding, 8);
+BTRFS_SETGET_FUNCS(raid_stride_devid, struct btrfs_raid_stride, devid, 64);
+BTRFS_SETGET_FUNCS(raid_stride_physical, struct btrfs_raid_stride, physical, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_stripe_extent_encoding,
+			 struct btrfs_stripe_extent, encoding, 8);
+BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_devid, struct btrfs_raid_stride, devid, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_physical, struct btrfs_raid_stride, physical, 64);
+
 /* struct btrfs_dev_extent */
 BTRFS_SETGET_FUNCS(dev_extent_chunk_tree, struct btrfs_dev_extent, chunk_tree, 64);
 BTRFS_SETGET_FUNCS(dev_extent_chunk_objectid, struct btrfs_dev_extent,
@@ -349,6 +358,9 @@ BTRFS_SETGET_FUNCS(extent_data_ref_count, struct btrfs_extent_data_ref, count, 3
 
 BTRFS_SETGET_FUNCS(shared_data_ref_count, struct btrfs_shared_data_ref, count, 32);
 
+BTRFS_SETGET_FUNCS(extent_owner_ref_root_id, struct btrfs_extent_owner_ref,
+		   root_id, 64);
+
 BTRFS_SETGET_FUNCS(extent_inline_ref_type, struct btrfs_extent_inline_ref,
 		   type, 8);
 BTRFS_SETGET_FUNCS(extent_inline_ref_offset, struct btrfs_extent_inline_ref,
@@ -365,6 +377,8 @@ static inline u32 btrfs_extent_inline_ref_size(int type)
 	if (type == BTRFS_EXTENT_DATA_REF_KEY)
 		return sizeof(struct btrfs_extent_data_ref) +
 		       offsetof(struct btrfs_extent_inline_ref, offset);
+	if (type == BTRFS_EXTENT_OWNER_REF_KEY)
+		return sizeof(struct btrfs_extent_inline_ref);
 	return 0;
 }
 
@@ -966,6 +980,8 @@ BTRFS_SETGET_FUNCS(qgroup_status_flags, struct btrfs_qgroup_status_item,
 		   flags, 64);
 BTRFS_SETGET_FUNCS(qgroup_status_rescan, struct btrfs_qgroup_status_item,
 		   rescan, 64);
+BTRFS_SETGET_FUNCS(qgroup_status_enable_gen, struct btrfs_qgroup_status_item,
+		   enable_gen, 64);
 
 /* btrfs_qgroup_info_item */
 BTRFS_SETGET_FUNCS(qgroup_info_generation, struct btrfs_qgroup_info_item,
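
The accessor macros in this file generate a getter and setter pair per on-disk field, and on-disk fields are little-endian and not necessarily aligned, which is presumably why the hunk pulls in `<asm/unaligned.h>`. The following is a minimal userspace sketch of that pattern, not the kernel macro: `DEMO_SETGET_STACK_FUNCS`, `struct demo_raid_stride`, and the `demo_get_le64`/`demo_put_le64` helpers are hypothetical names, and only 64-bit members are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit little-endian value byte by byte; safe for unaligned data. */
static inline uint64_t demo_get_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Store a 64-bit value as little-endian bytes, regardless of host order. */
static inline void demo_put_le64(uint64_t v, uint8_t *p)
{
	for (int i = 0; i < 8; i++) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/* Hypothetical stand-in for struct btrfs_raid_stride: two LE64 fields. */
struct demo_raid_stride {
	uint8_t devid[8];
	uint8_t physical[8];
};

/* Generate demo_<name>() and demo_set_<name>() for one 64-bit member. */
#define DEMO_SETGET_STACK_FUNCS(name, type, member)			\
static inline uint64_t demo_##name(const type *s)			\
{									\
	assert(s != NULL);						\
	return demo_get_le64(s->member);				\
}									\
static inline void demo_set_##name(type *s, uint64_t val)		\
{									\
	assert(s != NULL);						\
	demo_put_le64(val, s->member);					\
}

DEMO_SETGET_STACK_FUNCS(stack_raid_stride_devid, struct demo_raid_stride, devid)
DEMO_SETGET_STACK_FUNCS(stack_raid_stride_physical, struct demo_raid_stride, physical)
```

A caller would use `demo_set_stack_raid_stride_physical(&stride, value)` and `demo_stack_raid_stride_physical(&stride)` instead of touching the member directly, so the byte order is handled in one place.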

fs/btrfs/async-thread.c

Lines changed: 6 additions & 6 deletions
@@ -9,6 +9,7 @@
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/freezer.h>
+#include <trace/events/btrfs.h>
 #include "async-thread.h"
 #include "ctree.h"
 
@@ -242,7 +243,7 @@ static void run_ordered_work(struct btrfs_workqueue *wq,
 			break;
 		trace_btrfs_ordered_sched(work);
 		spin_unlock_irqrestore(lock, flags);
-		work->ordered_func(work);
+		work->ordered_func(work, false);
 
 		/* now take the lock again and drop our item from the list */
 		spin_lock_irqsave(lock, flags);
@@ -277,15 +278,15 @@ static void run_ordered_work(struct btrfs_workqueue *wq,
 			 * We don't want to call the ordered free functions with
 			 * the lock held.
 			 */
-			work->ordered_free(work);
+			work->ordered_func(work, true);
 			/* NB: work must not be dereferenced past this point. */
 			trace_btrfs_all_work_done(wq->fs_info, work);
 		}
 	}
 	spin_unlock_irqrestore(lock, flags);
 
 	if (free_self) {
-		self->ordered_free(self);
+		self->ordered_func(self, true);
 		/* NB: self must not be dereferenced past this point. */
 		trace_btrfs_all_work_done(wq->fs_info, self);
 	}
@@ -300,7 +301,7 @@ static void btrfs_work_helper(struct work_struct *normal_work)
 
 	/*
 	 * We should not touch things inside work in the following cases:
-	 * 1) after work->func() if it has no ordered_free
+	 * 1) after work->func() if it has no ordered_func(..., true) to free
 	 *    Since the struct is freed in work->func().
 	 * 2) after setting WORK_DONE_BIT
 	 *    The work may be freed in other threads almost instantly.
@@ -329,11 +330,10 @@ static void btrfs_work_helper(struct work_struct *normal_work)
 }
 
 void btrfs_init_work(struct btrfs_work *work, btrfs_func_t func,
-		     btrfs_func_t ordered_func, btrfs_func_t ordered_free)
+		     btrfs_ordered_func_t ordered_func)
 {
 	work->func = func;
 	work->ordered_func = ordered_func;
-	work->ordered_free = ordered_free;
 	INIT_WORK(&work->normal_work, btrfs_work_helper);
 	INIT_LIST_HEAD(&work->ordered_list);
 	work->flags = 0;

fs/btrfs/async-thread.h

Lines changed: 3 additions & 3 deletions
@@ -13,11 +13,11 @@ struct btrfs_fs_info;
 struct btrfs_workqueue;
 struct btrfs_work;
 typedef void (*btrfs_func_t)(struct btrfs_work *arg);
+typedef void (*btrfs_ordered_func_t)(struct btrfs_work *arg, bool);
 
 struct btrfs_work {
 	btrfs_func_t func;
-	btrfs_func_t ordered_func;
-	btrfs_func_t ordered_free;
+	btrfs_ordered_func_t ordered_func;
 
 	/* Don't touch things below */
 	struct work_struct normal_work;
@@ -35,7 +35,7 @@ struct btrfs_workqueue *btrfs_alloc_ordered_workqueue(
 				struct btrfs_fs_info *fs_info, const char *name,
 				unsigned int flags);
 void btrfs_init_work(struct btrfs_work *work, btrfs_func_t func,
-		     btrfs_func_t ordered_func, btrfs_func_t ordered_free);
+		     btrfs_ordered_func_t ordered_func);
 void btrfs_queue_work(struct btrfs_workqueue *wq,
 		      struct btrfs_work *work);
 void btrfs_destroy_workqueue(struct btrfs_workqueue *wq);
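
The async-thread change folds the separate `ordered_free` callback into `ordered_func` by adding a `bool` argument, which drops one function pointer from every work item. The following is a minimal userspace sketch of that pattern under stated assumptions: all `demo_*` names are hypothetical, there is no locking or workqueue, and `demo_run_ordered()` only mirrors the order in which the diff calls the callback (`false` for the ordered step, then `true` to release the item).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_work;
typedef void (*demo_func_t)(struct demo_work *work);
typedef void (*demo_ordered_func_t)(struct demo_work *work, bool do_free);

struct demo_work {
	demo_func_t func;
	demo_ordered_func_t ordered_func;
};

/* A hypothetical work item that embeds the generic demo_work. */
struct demo_job {
	int *ordered_runs;	/* counts ordered-phase invocations */
	bool *freed;		/* set once the job has been released */
	struct demo_work work;
};

/* Userspace equivalent of container_of() for this one embedding. */
static struct demo_job *demo_job_from_work(struct demo_work *work)
{
	return (struct demo_job *)((char *)work - offsetof(struct demo_job, work));
}

/* One callback serves both phases; do_free selects which one runs. */
static void demo_job_ordered(struct demo_work *work, bool do_free)
{
	struct demo_job *job = demo_job_from_work(work);

	if (do_free) {
		*job->freed = true;
		free(job);
		return;
	}
	(*job->ordered_runs)++;
}

static void demo_init_work(struct demo_work *work, demo_func_t func,
			   demo_ordered_func_t ordered_func)
{
	work->func = func;
	work->ordered_func = ordered_func;
}

static struct demo_job *demo_job_new(int *ordered_runs, bool *freed)
{
	struct demo_job *job = malloc(sizeof(*job));

	if (!job)
		return NULL;
	job->ordered_runs = ordered_runs;
	job->freed = freed;
	demo_init_work(&job->work, NULL, demo_job_ordered);
	return job;
}

/* Ordered step first, then the free step, as in the diff. */
static void demo_run_ordered(struct demo_work *work)
{
	assert(work && work->ordered_func);
	work->ordered_func(work, false);
	work->ordered_func(work, true);
	/* work must not be dereferenced past this point. */
}
```

The trade-off is that each callback now needs an early `if (do_free)` branch, as `run_one_async_done()` does in `fs/btrfs/bio.c`.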

fs/btrfs/backref.c

Lines changed: 4 additions & 1 deletion
@@ -1129,6 +1129,9 @@ static int add_inline_refs(struct btrfs_backref_walk_ctx *ctx,
 					       count, sc, GFP_NOFS);
 			break;
 		}
+		case BTRFS_EXTENT_OWNER_REF_KEY:
+			ASSERT(btrfs_fs_incompat(ctx->fs_info, SIMPLE_QUOTA));
+			break;
 		default:
 			WARN_ON(1);
 		}
@@ -2998,7 +3001,7 @@ int btrfs_backref_iter_next(struct btrfs_backref_iter *iter)
 }
 
 void btrfs_backref_init_cache(struct btrfs_fs_info *fs_info,
-			      struct btrfs_backref_cache *cache, int is_reloc)
+			      struct btrfs_backref_cache *cache, bool is_reloc)
 {
 	int i;
 

fs/btrfs/backref.h

Lines changed: 5 additions & 5 deletions
@@ -247,7 +247,7 @@ struct prelim_ref {
 	struct rb_node rbnode;
 	u64 root_id;
 	struct btrfs_key key_for_search;
-	int level;
+	u8 level;
 	int count;
 	struct extent_inode_elem *inode_list;
 	u64 parent;
@@ -440,11 +440,11 @@ struct btrfs_backref_cache {
 	 * Reloction backref cache require more info for reloc root compared
 	 * to generic backref cache.
 	 */
-	unsigned int is_reloc;
+	bool is_reloc;
 };
 
 void btrfs_backref_init_cache(struct btrfs_fs_info *fs_info,
-			      struct btrfs_backref_cache *cache, int is_reloc);
+			      struct btrfs_backref_cache *cache, bool is_reloc);
 struct btrfs_backref_node *btrfs_backref_alloc_node(
 		struct btrfs_backref_cache *cache, u64 bytenr, int level);
 struct btrfs_backref_edge *btrfs_backref_alloc_edge(
@@ -533,9 +533,9 @@ void btrfs_backref_cleanup_node(struct btrfs_backref_cache *cache,
 void btrfs_backref_release_cache(struct btrfs_backref_cache *cache);
 
 static inline void btrfs_backref_panic(struct btrfs_fs_info *fs_info,
-				       u64 bytenr, int errno)
+				       u64 bytenr, int error)
 {
-	btrfs_panic(fs_info, errno,
+	btrfs_panic(fs_info, error,
 		    "Inconsistency in backref cache found at offset %llu",
 		    bytenr);
 }

fs/btrfs/bio.c

Lines changed: 33 additions & 14 deletions
@@ -10,11 +10,11 @@
 #include "volumes.h"
 #include "raid56.h"
 #include "async-thread.h"
-#include "check-integrity.h"
 #include "dev-replace.h"
 #include "rcu-string.h"
 #include "zoned.h"
 #include "file-item.h"
+#include "raid-stripe-tree.h"
 
 static struct bio_set btrfs_bioset;
 static struct bio_set btrfs_clone_bioset;
@@ -416,6 +416,9 @@ static void btrfs_orig_write_end_io(struct bio *bio)
 	else
 		bio->bi_status = BLK_STS_OK;
 
+	if (bio_op(bio) == REQ_OP_ZONE_APPEND && !bio->bi_status)
+		stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
+
 	btrfs_orig_bbio_end_io(bbio);
 	btrfs_put_bioc(bioc);
 }
@@ -427,6 +430,8 @@ static void btrfs_clone_write_end_io(struct bio *bio)
 	if (bio->bi_status) {
 		atomic_inc(&stripe->bioc->error);
 		btrfs_log_dev_io_error(bio, stripe->dev);
+	} else if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
+		stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
 	}
 
 	/* Pass on control to the original bio this one was cloned from */
@@ -463,8 +468,6 @@ static void btrfs_submit_dev_bio(struct btrfs_device *dev, struct bio *bio)
 		(unsigned long)dev->bdev->bd_dev, btrfs_dev_name(dev),
 		dev->devid, bio->bi_iter.bi_size);
 
-	btrfsic_check_bio(bio);
-
 	if (bio->bi_opf & REQ_BTRFS_CGROUP_PUNT)
 		blkcg_punt_bio_submit(bio);
 	else
@@ -490,6 +493,7 @@ static void btrfs_submit_mirrored_bio(struct btrfs_io_context *bioc, int dev_nr)
 	bio->bi_private = &bioc->stripes[dev_nr];
 	bio->bi_iter.bi_sector = bioc->stripes[dev_nr].physical >> SECTOR_SHIFT;
 	bioc->stripes[dev_nr].bioc = bioc;
+	bioc->size = bio->bi_iter.bi_size;
 	btrfs_submit_dev_bio(bioc->stripes[dev_nr].dev, bio);
 }
 
@@ -499,6 +503,8 @@ static void __btrfs_submit_bio(struct bio *bio, struct btrfs_io_context *bioc,
 	if (!bioc) {
 		/* Single mirror read/write fast path. */
 		btrfs_bio(bio)->mirror_num = mirror_num;
+		if (bio_op(bio) != REQ_OP_READ)
+			btrfs_bio(bio)->orig_physical = smap->physical;
 		bio->bi_iter.bi_sector = smap->physical >> SECTOR_SHIFT;
 		if (bio_op(bio) != REQ_OP_READ)
 			btrfs_bio(bio)->orig_physical = smap->physical;
@@ -568,13 +574,20 @@ static void run_one_async_start(struct btrfs_work *work)
  *
  * At IO completion time the csums attached on the ordered extent record are
  * inserted into the tree.
+ *
+ * If called with @do_free == true, then it will free the work struct.
  */
-static void run_one_async_done(struct btrfs_work *work)
+static void run_one_async_done(struct btrfs_work *work, bool do_free)
 {
 	struct async_submit_bio *async =
 		container_of(work, struct async_submit_bio, work);
 	struct bio *bio = &async->bbio->bio;
 
+	if (do_free) {
+		kfree(container_of(work, struct async_submit_bio, work));
+		return;
+	}
+
 	/* If an error occurred we just want to clean up the bio and move on. */
 	if (bio->bi_status) {
 		btrfs_orig_bbio_end_io(async->bbio);
@@ -590,11 +603,6 @@ static void run_one_async_done(struct btrfs_work *work)
 	__btrfs_submit_bio(bio, async->bioc, &async->smap, async->mirror_num);
 }
 
-static void run_one_async_free(struct btrfs_work *work)
-{
-	kfree(container_of(work, struct async_submit_bio, work));
-}
-
 static bool should_async_write(struct btrfs_bio *bbio)
 {
 	/* Submit synchronously if the checksum implementation is fast. */
@@ -636,8 +644,7 @@ static bool btrfs_wq_submit_bio(struct btrfs_bio *bbio,
 	async->smap = *smap;
 	async->mirror_num = mirror_num;
 
-	btrfs_init_work(&async->work, run_one_async_start, run_one_async_done,
-			run_one_async_free);
+	btrfs_init_work(&async->work, run_one_async_start, run_one_async_done);
 	btrfs_queue_work(fs_info->workers, &async->work);
 	return true;
 }
@@ -657,9 +664,11 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
 	blk_status_t ret;
 	int error;
 
+	smap.is_scrub = !bbio->inode;
+
 	btrfs_bio_counter_inc_blocked(fs_info);
 	error = btrfs_map_block(fs_info, btrfs_op(bio), logical, &map_length,
-				&bioc, &smap, &mirror_num, 1);
+				&bioc, &smap, &mirror_num);
 	if (error) {
 		ret = errno_to_blk_status(error);
 		goto fail;
@@ -691,6 +700,18 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
 			bio->bi_opf |= REQ_OP_ZONE_APPEND;
 		}
 
+		if (is_data_bbio(bbio) && bioc &&
+		    btrfs_need_stripe_tree_update(bioc->fs_info, bioc->map_type)) {
+			/*
+			 * No locking for the list update, as we only add to
+			 * the list in the I/O submission path, and list
+			 * iteration only happens in the completion path, which
+			 * can't happen until after the last submission.
+			 */
+			btrfs_get_bioc(bioc);
+			list_add_tail(&bioc->rst_ordered_entry, &bbio->ordered->bioc_list);
+		}
+
 		/*
 		 * Csum items for reloc roots have already been cloned at this
 		 * point, so they are handled as part of the no-checksum case.
@@ -779,8 +800,6 @@ int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
 	bio_init(&bio, smap.dev->bdev, &bvec, 1, REQ_OP_WRITE | REQ_SYNC);
 	bio.bi_iter.bi_sector = smap.physical >> SECTOR_SHIFT;
 	__bio_add_page(&bio, page, length, pg_offset);
-
-	btrfsic_check_bio(&bio);
 	ret = submit_bio_wait(&bio);
 	if (ret) {
 		/* try to remap that extent elsewhere? */
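
The two write end_io hunks in this file record where a zone append actually landed. For `REQ_OP_ZONE_APPEND` the device chooses the write position and the block layer reports it back in the bio's sector field on completion, so the stripe's byte address has to be recomputed from it. The following is a minimal userspace sketch of that conversion, assuming 512-byte sectors (`SECTOR_SHIFT` is 9 in the kernel); `struct demo_bio`, `struct demo_stripe`, and the `demo_*` functions are hypothetical, trimmed-down stand-ins rather than kernel types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The kernel defines SECTOR_SHIFT as 9, i.e. 512-byte sectors. */
#define DEMO_SECTOR_SHIFT 9

/* Byte offset on the device to sector number (submission side). */
static inline uint64_t demo_physical_to_sector(uint64_t physical)
{
	return physical >> DEMO_SECTOR_SHIFT;
}

/* Sector number back to a byte offset (completion side). */
static inline uint64_t demo_sector_to_physical(uint64_t sector)
{
	return sector << DEMO_SECTOR_SHIFT;
}

/* Hypothetical stand-ins for the stripe record and a completed bio. */
struct demo_stripe {
	uint64_t physical;
};

struct demo_bio {
	uint64_t bi_sector;	/* final sector as reported on completion */
	int status;		/* 0 on success, nonzero on error */
	bool is_zone_append;
};

/* Update the stripe only for a successful zone append, as in the diff. */
static void demo_write_end_io(const struct demo_bio *bio,
			      struct demo_stripe *stripe)
{
	assert(bio && stripe);
	if (bio->is_zone_append && bio->status == 0)
		stripe->physical = demo_sector_to_physical(bio->bi_sector);
}
```

As a worked value, a completion reporting sector 2048 corresponds to byte offset 2048 × 512 = 1048576 (1 MiB); a failed bio leaves the stored address untouched.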
