Commit 70e7730

Merge tag 'vfs-6.13.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
 "Features:

   - Fixup and improve NLM and kNFSD file lock callbacks

     Last year both GFS2 and OCFS2 had some work done to make their
     locking more robust when exported over NFS. Unfortunately, part of
     that work caused both NLM (for NFS v3 exports) and kNFSD (for
     NFSv4.1+ exports) to no longer send lock notifications to clients.

     This in itself is not a huge problem because most NFS clients will
     still poll the server in order to acquire a conflicted lock.

     It's important for NLM and kNFSD that they do not block their
     kernel threads inside filesystem's file_lock implementations
     because that can produce deadlocks. We used to make sure of this by
     only trusting that posix_lock_file() can correctly handle blocking
     lock calls asynchronously, so the lock managers would only set up
     their file_lock requests for async callbacks if the filesystem did
     not define its own lock() file operation.

     However, when GFS2 and OCFS2 grew the capability to correctly
     handle blocking lock requests asynchronously, they started
     signalling this behavior with EXPORT_OP_ASYNC_LOCK, and the check
     for also trusting posix_lock_file() was inadvertently dropped, so
     now most filesystems no longer produce lock notifications when
     exported over NFS.

     Fix this by using an fop_flag which greatly simplifies the problem
     and paves the way for future uses by both filesystems and lock
     managers alike.

   - Add a sysctl to delete the dentry when a file is removed instead of
     making it a negative dentry

     Commit 681ce86 ("vfs: Delete the associated dentry when deleting a
     file") introduced an unconditional deletion of the associated
     dentry when a file is removed. However, this led to performance
     regressions in specific benchmarks, such as
     filebench.sum_operations/s, prompting a revert in commit 4a4be1a
     ("Revert "vfs: Delete the associated dentry when deleting a
     file"").

     This reintroduces the concept conditionally through a sysctl.

   - Expand the statmount() system call:

       * Report the filesystem subtype in a new fs_subtype field to
         e.g., report fuse filesystem subtypes

       * Report the superblock source in a new sb_source field

       * Add a new way to return filesystem specific mount options in an
         option array that returns filesystem specific mount options
         separated by zero bytes and unescaped. This allows callers to
         retrieve filesystem specific mount options and immediately pass
         them to e.g., fsconfig() without having to unescape or split
         them

       * Report security (LSM) specific mount options in a separate
         security option array. We don't lump them together with
         filesystem specific mount options as security mount options are
         generic and most users aren't interested in them

         The format is the same as for the filesystem specific mount
         option array

   - Support relative paths in fsconfig()'s FSCONFIG_SET_STRING command

   - Optimize acl_permission_check() to avoid costly {g,u}id ownership
     checks if possible

   - Use smp_mb__after_spinlock() to avoid full smp_mb() in evict()

   - Add synchronous wakeup support for ep_poll_callback.

     Currently, epoll only uses wake_up() to wake up tasks. But
     sometimes there are epoll users which want to use the synchronous
     wakeup flag to give a hint to the scheduler, e.g., the Android
     binder driver.

     So add a wake_up_sync() define, and use wake_up_sync() when sync is
     true in ep_poll_callback()

  Fixes:

   - Fix kernel documentation for inode_insert5() and iget5_locked()

   - Annotate racy epoll check on file->f_ep

   - Make F_DUPFD_QUERY associative

   - Avoid filename buffer overrun in initramfs

   - Don't let statmount() return empty strings

   - Add a cond_resched() to dump_user_range() to avoid hogging the CPU

   - Don't query the device logical blocksize multiple times for hfsplus

   - Make filemap_read() check that the offset is positive or zero

  Cleanups:

   - Various typo fixes

   - Cleanup wbc_attach_fdatawrite_inode()

   - Add __releases annotation to wbc_attach_and_unlock_inode()

   - Add hugetlbfs tracepoints

   - Fix various vfs kernel doc parameters

   - Remove obsolete TODO comment from io_cancel()

   - Convert wbc_account_cgroup_owner() to take a folio

   - Fix comments for BANDWIDTH_INTERVAL and wb_domain_writeout_add()

   - Reorder struct posix_acl to save 8 bytes

   - Annotate struct posix_acl with __counted_by()

   - Replace one-element array with flexible array member in freevxfs

   - Use idiomatic atomic64_inc_return() in alloc_mnt_ns()"

* tag 'vfs-6.13.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits)
  statmount: retrieve security mount options
  vfs: make evict() use smp_mb__after_spinlock instead of smp_mb
  statmount: add flag to retrieve unescaped options
  fs: add the ability for statmount() to report the sb_source
  writeback: wbc_attach_fdatawrite_inode out of line
  writeback: add a __releases annoation to wbc_attach_and_unlock_inode
  fs: add the ability for statmount() to report the fs_subtype
  fs: don't let statmount return empty strings
  fs:aio: Remove TODO comment suggesting hash or array usage in io_cancel()
  hfsplus: don't query the device logical block size multiple times
  freevxfs: Replace one-element array with flexible array member
  fs: optimize acl_permission_check()
  initramfs: avoid filename buffer overrun
  fs/writeback: convert wbc_account_cgroup_owner to take a folio
  acl: Annotate struct posix_acl with __counted_by()
  acl: Realign struct posix_acl to save 8 bytes
  epoll: Add synchronous wakeup support for ep_poll_callback
  coredump: add cond_resched() to dump_user_range
  mm/page-writeback.c: Fix comment of wb_domain_writeout_add()
  mm/page-writeback.c: Update comment for BANDWIDTH_INTERVAL
  ...
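
The merge message describes the new statmount() option array as a run of unescaped strings separated by zero bytes. As a rough illustration of what consuming such a buffer could look like, here is a minimal userspace sketch. It does not call statmount() itself; the function names, the visitor type, and the idea that the caller already holds a buffer pointer, its length, and an entry count are all assumptions made for the example, not the kernel's UAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: invoked once per option string. */
typedef void (*opt_visitor)(const char *opt, void *ctx);

/*
 * Walk up to `count` NUL-terminated strings packed back to back in `buf`
 * (`len` bytes in total). Returns the number of strings visited and stops
 * early if the buffer ends before a terminator is found.
 */
static size_t walk_option_array(const char *buf, size_t len, size_t count,
                                opt_visitor visit, void *ctx)
{
    size_t off = 0, seen = 0;

    assert(buf != NULL || len == 0);
    while (seen < count && off < len) {
        const char *end = memchr(buf + off, '\0', len - off);

        if (end == NULL)
            break;                      /* unterminated tail: stop here */
        if (visit != NULL)
            visit(buf + off, ctx);      /* string is usable as-is, no unescaping */
        seen++;
        off = (size_t)(end - buf) + 1;  /* step past the zero byte */
    }
    return seen;
}

/* Example visitor: counts options of the form "key=value". */
static void count_key_value(const char *opt, void *ctx)
{
    if (strchr(opt, '=') != NULL)
        (*(size_t *)ctx)++;
}
```

Because each entry is already a plain C string, a visitor could hand it straight to a call such as fsconfig() without any splitting or unescaping, which is the convenience the merge message points out.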
2 parents 4eb98b7 + aefff51 commit 70e7730

46 files changed: 531 additions, 141 deletions

Documentation/admin-guide/cgroup-v2.rst

Lines changed: 1 addition & 1 deletion
@@ -2954,7 +2954,7 @@ following two functions.
 	a queue (device) has been associated with the bio and
 	before submission.
 
-  wbc_account_cgroup_owner(@wbc, @page, @bytes)
+  wbc_account_cgroup_owner(@wbc, @folio, @bytes)
 	Should be called for each data segment being written out.
 	While this function doesn't care exactly when it's called
 	during the writeback session, it's the easiest and most

Documentation/admin-guide/sysctl/fs.rst

Lines changed: 5 additions & 0 deletions
@@ -38,6 +38,11 @@ requests. ``aio-max-nr`` allows you to change the maximum value
 ``aio-max-nr`` does not result in the
 pre-allocation or re-sizing of any kernel data structures.
 
+dentry-negative
+----------------------------
+
+Policy for negative dentries. Set to 1 to to always delete the dentry when a
+file is removed, and 0 to disable it. By default, this behavior is disabled.
 
 dentry-state
 ------------
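
The new dentry-negative entry is a plain integer sysctl, so it can be inspected like any other file under /proc/sys. The sketch below is one possible way to read or set such a value from C. The helper names are made up for the example, and the path is an assumption based on the usual /proc/sys/fs/ location of entries documented in fs.rst; writing to a real sysctl normally needs root.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed location, following the usual /proc/sys/fs/ mapping for fs.rst entries. */
#define DENTRY_NEGATIVE_PATH "/proc/sys/fs/dentry-negative"

/*
 * Read a single integer from a sysctl-style file.
 * Returns 0 and stores the value in *out on success; returns -1 if the
 * file cannot be opened or parsed (for example on a kernel without it).
 */
static int read_sysctl_int(const char *path, int *out)
{
    FILE *f;
    int ok;

    assert(path != NULL && out != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%d", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Write a single integer to a sysctl-style file. Returns 0 on success, -1 on error. */
static int write_sysctl_int(const char *path, int value)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = (fprintf(f, "%d\n", value) > 0);
    if (fclose(f) != 0)
        ok = 0;                 /* the kernel may reject the value at flush time */
    return ok ? 0 : -1;
}
```

A caller would pass DENTRY_NEGATIVE_PATH to read_sysctl_int() and treat a -1 return as "sysctl not available" rather than as a fatal error.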

Documentation/filesystems/nfs/exporting.rst

Lines changed: 0 additions & 7 deletions
@@ -238,10 +238,3 @@ following flags are defined:
     all of an inode's dirty data on last close. Exports that behave this
     way should set EXPORT_OP_FLUSH_ON_CLOSE so that NFSD knows to skip
     waiting for writeback when closing such files.
-
-  EXPORT_OP_ASYNC_LOCK - Indicates a capable filesystem to do async lock
-    requests from lockd. Only set EXPORT_OP_ASYNC_LOCK if the filesystem has
-    it's own ->lock() functionality as core posix_lock_file() implementation
-    has no async lock request handling yet. For more information about how to
-    indicate an async lock request from a ->lock() file_operations struct, see
-    fs/locks.c and comment for the function vfs_lock_file().
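
The documentation removed here described the export-op signal that the merge message says is being replaced by an fop_flag. The decision the lock managers need, as described in that message, is: trust the default posix_lock_file() path when the filesystem has no lock() operation of its own, and otherwise require the filesystem to opt in. The following is a small userspace model of that rule only. The struct, the flag name DEMO_FOP_ASYNC_LOCK, and the helper are stand-ins invented for the sketch, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for the kernel's fop_flag. */
#define DEMO_FOP_ASYNC_LOCK (1u << 0)

/* Trimmed-down stand-in for struct file_operations. */
struct demo_file_ops {
    int (*lock)(void *file, int cmd, void *fl);
    unsigned int fop_flags;
};

/*
 * May a lock manager set up an asynchronous lock request?
 *  - no ->lock() op: the generic POSIX lock path is used, which is trusted;
 *  - custom ->lock() op: only if the filesystem opted in via the flag.
 */
static bool demo_can_async_lock(const struct demo_file_ops *fops)
{
    assert(fops != NULL);
    if (fops->lock == NULL)
        return true;
    return (fops->fop_flags & DEMO_FOP_ASYNC_LOCK) != 0;
}

/* Dummy ->lock() implementation used to model a filesystem with its own op. */
static int demo_lock_stub(void *file, int cmd, void *fl)
{
    (void)file;
    (void)cmd;
    (void)fl;
    return 0;
}
```

Folding both conditions into one helper is what restores notifications for filesystems that never defined lock(), which is the regression the merge message describes.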

MAINTAINERS

Lines changed: 1 addition & 0 deletions
@@ -10493,6 +10493,7 @@ F:	Documentation/mm/hugetlbfs_reserv.rst
 F:	Documentation/mm/vmemmap_dedup.rst
 F:	fs/hugetlbfs/
 F:	include/linux/hugetlb.h
+F:	include/trace/events/hugetlbfs.h
 F:	mm/hugetlb.c
 F:	mm/hugetlb_vmemmap.c
 F:	mm/hugetlb_vmemmap.h

fs/aio.c

Lines changed: 0 additions & 1 deletion
@@ -2191,7 +2191,6 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
 		return -EINVAL;
 
 	spin_lock_irq(&ctx->ctx_lock);
-	/* TODO: use a hash or array, this sucks. */
 	list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
 		if (kiocb->ki_res.obj == obj) {
 			ret = kiocb->ki_cancel(&kiocb->rw);

fs/btrfs/extent_io.c

Lines changed: 3 additions & 4 deletions
@@ -786,7 +786,7 @@ static void submit_extent_folio(struct btrfs_bio_ctrl *bio_ctrl,
 		}
 
 		if (bio_ctrl->wbc)
-			wbc_account_cgroup_owner(bio_ctrl->wbc, &folio->page,
+			wbc_account_cgroup_owner(bio_ctrl->wbc, folio,
 						 len);
 
 		size -= len;
@@ -1708,7 +1708,7 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb,
 		ret = bio_add_folio(&bbio->bio, folio, eb->len,
 				    eb->start - folio_pos(folio));
 		ASSERT(ret);
-		wbc_account_cgroup_owner(wbc, folio_page(folio, 0), eb->len);
+		wbc_account_cgroup_owner(wbc, folio, eb->len);
 		folio_unlock(folio);
 	} else {
 		int num_folios = num_extent_folios(eb);
@@ -1722,8 +1722,7 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb,
 			folio_start_writeback(folio);
 			ret = bio_add_folio(&bbio->bio, folio, eb->folio_size, 0);
 			ASSERT(ret);
-			wbc_account_cgroup_owner(wbc, folio_page(folio, 0),
-						 eb->folio_size);
+			wbc_account_cgroup_owner(wbc, folio, eb->folio_size);
 			wbc->nr_to_write -= folio_nr_pages(folio);
 			folio_unlock(folio);
 		}

fs/btrfs/inode.c

Lines changed: 1 addition & 1 deletion
@@ -1729,7 +1729,7 @@ static bool run_delalloc_compressed(struct btrfs_inode *inode,
 			 * need full accuracy. Just account the whole thing
 			 * against the first page.
 			 */
-			wbc_account_cgroup_owner(wbc, &locked_folio->page,
+			wbc_account_cgroup_owner(wbc, locked_folio,
 						 cur_end - start);
 			async_chunk[i].locked_folio = locked_folio;
 			locked_folio = NULL;

fs/buffer.c

Lines changed: 2 additions & 2 deletions
@@ -2803,7 +2803,7 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
 	bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9);
 	bio->bi_write_hint = write_hint;
 
-	__bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh));
+	bio_add_folio_nofail(bio, bh->b_folio, bh->b_size, bh_offset(bh));
 
 	bio->bi_end_io = end_bio_bh_io_sync;
 	bio->bi_private = bh;
@@ -2813,7 +2813,7 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
 
 	if (wbc) {
 		wbc_init_bio(wbc, bio);
-		wbc_account_cgroup_owner(wbc, bh->b_page, bh->b_size);
+		wbc_account_cgroup_owner(wbc, bh->b_folio, bh->b_size);
 	}
 
 	submit_bio(bio);

fs/char_dev.c

Lines changed: 1 addition & 1 deletion
@@ -562,8 +562,8 @@ int cdev_device_add(struct cdev *cdev, struct device *dev)
 
 /**
  * cdev_device_del() - inverse of cdev_device_add
- * @dev: the device structure
  * @cdev: the cdev structure
+ * @dev: the device structure
  *
  * cdev_device_del() is a helper function to call cdev_del and device_del.
  * It should be used whenever cdev_device_add is used.

fs/coredump.c

Lines changed: 1 addition & 0 deletions
@@ -951,6 +951,7 @@ int dump_user_range(struct coredump_params *cprm, unsigned long start,
 		} else {
 			dump_skip(cprm, PAGE_SIZE);
 		}
+		cond_resched();
 	}
 	dump_page_free(dump_page);
 	return 1;
