Commit 4b84a4c

Merge tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Features:

   - Support caching symlink lengths in inodes

     The size is stored in a new union utilizing the same space as
     i_devices, thus avoiding growing the struct or taking up any more
     space

     When utilized it dodges strlen() in vfs_readlink(), giving about
     1.5% speed up when issuing readlink on /initrd.img on ext4

   - Add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag

     If a file system supports uncached buffered IO, it may set
     FOP_DONTCACHE and enable support for RWF_DONTCACHE.

     If RWF_DONTCACHE is attempted without the file system supporting
     it, it'll get errored with -EOPNOTSUPP

   - Enable VBOXGUEST and VBOXSF_FS on ARM64

     Now that VirtualBox is able to run as a host on arm64 (e.g. the
     Apple M3 processors) we can enable VBOXSF_FS (and in turn
     VBOXGUEST) for this architecture.

     Tested with various runs of bonnie++ and dbench on an Apple MacBook
     Pro with the latest Virtualbox 7.1.4 r165100 installed

  Cleanups:

   - Delay sysctl_nr_open check in expand_files()

   - Use kernel-doc includes in fiemap docbook

   - Use page->private instead of page->index in watch_queue

   - Use a consume fence in mnt_idmap() as it's heavily used in
     link_path_walk()

   - Replace magic number 7 with ARRAY_SIZE() in fc_log

   - Sort out a stale comment about races between fd alloc and dup2()

   - Fix return type of do_mount() from long to int

   - Various cosmetic cleanups for the lockref code

  Fixes:

   - Annotate spinning as unlikely() in __read_seqcount_begin

     The annotation already used to be there, but got lost in commit
     52ac39e ("seqlock: seqcount_t: Implement all read APIs as
     statement expressions")

   - Fix proc_handler for sysctl_nr_open

   - Flush delayed work in delayed fput()

   - Fix grammar and spelling in propagate_umount()

   - Fix ESP not readable during coredump

     In /proc/PID/stat, there is the kstkesp field which is the stack
     pointer of a thread. While the thread is active, this field reads
     zero. But during a coredump, it should have a valid value

     However, at the moment, kstkesp is zero even during coredump

   - Don't wake up the writer if the pipe is still full

   - Fix unbalanced user_access_end() in select code"

* tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  gfs2: use lockref_init for qd_lockref
  erofs: use lockref_init for pcl->lockref
  dcache: use lockref_init for d_lockref
  lockref: add a lockref_init helper
  lockref: drop superfluous externs
  lockref: use bool for false/true returns
  lockref: improve the lockref_get_not_zero description
  lockref: remove lockref_put_not_zero
  fs: Fix return type of do_mount() from long to int
  select: Fix unbalanced user_access_end()
  vbox: Enable VBOXGUEST and VBOXSF_FS on ARM64
  pipe_read: don't wake up the writer if the pipe is still full
  selftests: coredump: Add stackdump test
  fs/proc: do_task_stat: Fix ESP not readable during coredump
  fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag
  fs: sort out a stale comment about races between fd alloc and dup2
  fs: Fix grammar and spelling in propagate_umount()
  fs: fc_log replace magic number 7 with ARRAY_SIZE()
  fs: use a consume fence in mnt_idmap()
  file: flush delayed work in delayed fput()
  ...

2 parents d582952 + c859df5 commit 4b84a4c
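
The RWF_DONTCACHE item in the pull message says an unsupported request fails with -EOPNOTSUPP. From userspace, that suggests a simple fallback pattern. The following is only a sketch: `write_dontcache` is a hypothetical helper name, and the fallback `#define` assumes the flag value 0x00000080 from the 6.14 uapi header `include/uapi/linux/fs.h`, which older libc headers may not provide yet.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Assumption: value taken from the 6.14 uapi header; not in older libcs. */
#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080
#endif

/*
 * Hypothetical helper: write len bytes at offset off, asking for uncached
 * buffered I/O first. If the kernel or file system rejects the flag with
 * EOPNOTSUPP, retry as an ordinary buffered write.
 */
static ssize_t write_dontcache(int fd, const void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t n;

	assert(buf != NULL);

	n = pwritev2(fd, &iov, 1, off, RWF_DONTCACHE);
	if (n < 0 && errno == EOPNOTSUPP)
		n = pwritev2(fd, &iov, 1, off, 0);	/* plain buffered write */
	return n;
}
```

The data written is the same either way; the flag only asks the kernel not to keep the pages cached afterwards, so falling back changes cache behaviour rather than correctness.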

33 files changed: +415 -180 lines changed


Documentation/filesystems/fiemap.rst

Lines changed: 15 additions & 34 deletions
@@ -12,21 +12,10 @@ returns a list of extents.
 Request Basics
 --------------

-A fiemap request is encoded within struct fiemap::
-
-  struct fiemap {
-	__u64 fm_start; /* logical offset (inclusive) at
-			 * which to start mapping (in) */
-	__u64 fm_length; /* logical length of mapping which
-			  * userspace cares about (in) */
-	__u32 fm_flags; /* FIEMAP_FLAG_* flags for request (in/out) */
-	__u32 fm_mapped_extents; /* number of extents that were
-				  * mapped (out) */
-	__u32 fm_extent_count; /* size of fm_extents array (in) */
-	__u32 fm_reserved;
-	struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
-  };
+A fiemap request is encoded within struct fiemap:

+.. kernel-doc:: include/uapi/linux/fiemap.h
+   :identifiers: fiemap

 fm_start, and fm_length specify the logical range within the file
 which the process would like mappings for. Extents returned mirror
@@ -60,6 +49,8 @@ FIEMAP_FLAG_XATTR
   If this flag is set, the extents returned will describe the inodes
   extended attribute lookup tree, instead of its data tree.

+FIEMAP_FLAG_CACHE
+  This flag requests caching of the extents.

 Extent Mapping
 --------------
@@ -77,18 +68,10 @@ complete the requested range and will not have the FIEMAP_EXTENT_LAST
 flag set (see the next section on extent flags).

 Each extent is described by a single fiemap_extent structure as
-returned in fm_extents::
-
-    struct fiemap_extent {
-	    __u64 fe_logical; /* logical offset in bytes for the start of
-			       * the extent */
-	    __u64 fe_physical; /* physical offset in bytes for the start
-				* of the extent */
-	    __u64 fe_length; /* length in bytes for the extent */
-	    __u64 fe_reserved64[2];
-	    __u32 fe_flags; /* FIEMAP_EXTENT_* flags for this extent */
-	    __u32 fe_reserved[3];
-    };
+returned in fm_extents:
+
+.. kernel-doc:: include/uapi/linux/fiemap.h
+    :identifiers: fiemap_extent

 All offsets and lengths are in bytes and mirror those on disk. It is valid
 for an extents logical offset to start before the request or its logical
@@ -175,6 +158,8 @@ FIEMAP_EXTENT_MERGED
   userspace would be highly inefficient, the kernel will try to merge most
   adjacent blocks into 'extents'.

+FIEMAP_EXTENT_SHARED
+  This flag is set to request that space be shared with other files.

 VFS -> File System Implementation
 ---------------------------------
@@ -191,14 +176,10 @@ each discovered extent::
 		     u64 len);

 ->fiemap is passed struct fiemap_extent_info which describes the
-fiemap request::
-
-  struct fiemap_extent_info {
-	unsigned int fi_flags; /* Flags as passed from user */
-	unsigned int fi_extents_mapped; /* Number of mapped extents */
-	unsigned int fi_extents_max; /* Size of fiemap_extent array */
-	struct fiemap_extent *fi_extents_start; /* Start of fiemap_extent array */
-  };
+fiemap request:
+
+.. kernel-doc:: include/linux/fiemap.h
+    :identifiers: fiemap_extent_info

 It is intended that the file system should not need to access any of this
 structure directly. Filesystem handlers should be tolerant to signals and return
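
For readers who have not used the interface this document describes, here is a minimal userspace sketch of a fiemap request. `count_extents` is a hypothetical helper name. It relies on the documented behaviour that a request with `fm_extent_count` set to zero fills no extent array and reports the number of extents in `fm_mapped_extents`. File systems without fiemap support make the ioctl fail, which the helper reports as -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: return how many extents back the file at path,
 * or -1 if it cannot be opened or the ioctl is not supported.
 */
static long count_extents(const char *path)
{
	struct fiemap fm;
	long ret = -1;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	memset(&fm, 0, sizeof(fm));
	fm.fm_start = 0;			/* map from the start of the file */
	fm.fm_length = FIEMAP_MAX_OFFSET;	/* ... through the end */
	fm.fm_flags = FIEMAP_FLAG_SYNC;		/* flush dirty data before mapping */
	fm.fm_extent_count = 0;			/* count only, no extent array */

	if (ioctl(fd, FS_IOC_FIEMAP, &fm) == 0)
		ret = (long)fm.fm_mapped_extents;

	close(fd);
	return ret;
}
```

To read the extents themselves, a caller would allocate `sizeof(struct fiemap)` plus room for N `struct fiemap_extent` entries and set `fm_extent_count` to N.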

drivers/virt/vboxguest/Kconfig

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VBOXGUEST
 	tristate "Virtual Box Guest integration support"
-	depends on X86 && PCI && INPUT
+	depends on (ARM64 || X86) && PCI && INPUT
 	help
 	  This is a driver for the Virtual Box Guest PCI device used in
 	  Virtual Box virtual machines. Enabling this driver will add

fs/dcache.c

Lines changed: 1 addition & 2 deletions
@@ -1681,9 +1681,8 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 	/* Make sure we always see the terminating NUL character */
 	smp_store_release(&dentry->d_name.name, dname); /* ^^^ */

-	dentry->d_lockref.count = 1;
 	dentry->d_flags = 0;
-	spin_lock_init(&dentry->d_lock);
+	lockref_init(&dentry->d_lockref, 1);
 	seqcount_spinlock_init(&dentry->d_seq, &dentry->d_lock);
 	dentry->d_inode = NULL;
 	dentry->d_parent = dentry;
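
The dcache, erofs and gfs2 hunks in this merge all make the same change: two separate statements (spinlock init and count assignment) become one `lockref_init()` call. The userspace model below sketches that idea only. `toy_lockref` and its functions are hypothetical stand-ins, with a C11 `atomic_flag` in place of the kernel spinlock; this is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct lockref. */
struct toy_lockref {
	atomic_flag lock;	/* plays the role of spinlock_t */
	int count;		/* reference count protected by lock */
};

/* One helper initialises both fields, so callers cannot forget either. */
static void toy_lockref_init(struct toy_lockref *ref, int count)
{
	assert(ref != NULL);
	atomic_flag_clear(&ref->lock);
	ref->count = count;
}

/* Take a reference under the lock. */
static void toy_lockref_get(struct toy_lockref *ref)
{
	while (atomic_flag_test_and_set(&ref->lock))
		;	/* spin until the lock is free */
	ref->count++;
	atomic_flag_clear(&ref->lock);
}
```

Keeping initialisation in one place also means a future change to the structure's layout only has to touch the helper, not every caller.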

fs/erofs/zdata.c

Lines changed: 1 addition & 2 deletions
@@ -747,8 +747,7 @@ static int z_erofs_register_pcluster(struct z_erofs_decompress_frontend *fe)
 	if (IS_ERR(pcl))
 		return PTR_ERR(pcl);

-	spin_lock_init(&pcl->lockref.lock);
-	pcl->lockref.count = 1; /* one ref for this request */
+	lockref_init(&pcl->lockref, 1); /* one ref for this request */
 	pcl->algorithmformat = map->m_algorithmformat;
 	pcl->length = 0;
 	pcl->partial = true;

fs/ext4/inode.c

Lines changed: 2 additions & 1 deletion
@@ -5006,10 +5006,11 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 		if (IS_ENCRYPTED(inode)) {
 			inode->i_op = &ext4_encrypted_symlink_inode_operations;
 		} else if (ext4_inode_is_fast_symlink(inode)) {
-			inode->i_link = (char *)ei->i_data;
 			inode->i_op = &ext4_fast_symlink_inode_operations;
 			nd_terminate_link(ei->i_data, inode->i_size,
 				sizeof(ei->i_data) - 1);
+			inode_set_cached_link(inode, (char *)ei->i_data,
+					      inode->i_size);
 		} else {
 			inode->i_op = &ext4_symlink_inode_operations;
 		}

fs/ext4/namei.c

Lines changed: 3 additions & 1 deletion
@@ -3418,7 +3418,6 @@ static int ext4_symlink(struct mnt_idmap *idmap, struct inode *dir,
 			inode->i_op = &ext4_symlink_inode_operations;
 		} else {
 			inode->i_op = &ext4_fast_symlink_inode_operations;
-			inode->i_link = (char *)&EXT4_I(inode)->i_data;
 		}
 	}

@@ -3434,6 +3433,9 @@ static int ext4_symlink(struct mnt_idmap *idmap, struct inode *dir,
 		       disk_link.len);
 		inode->i_size = disk_link.len - 1;
 		EXT4_I(inode)->i_disksize = inode->i_size;
+		if (!IS_ENCRYPTED(inode))
+			inode_set_cached_link(inode, (char *)&EXT4_I(inode)->i_data,
+					      inode->i_size);
 	}
 	err = ext4_add_nondir(handle, dentry, &inode);
 	if (handle)
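
Both ext4 hunks switch from assigning `inode->i_link` directly to calling `inode_set_cached_link()`, which also records the link length. The pull message says this lets `vfs_readlink()` skip a `strlen()`. The toy model below sketches that trade-off in userspace. All `toy_` names are hypothetical, and the struct is a simplified stand-in rather than the kernel's inode layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the relevant inode fields. */
struct toy_inode {
	const char *link;	/* NUL-terminated symlink target */
	size_t linklen;		/* valid only when cached is set */
	int cached;		/* nonzero once the length has been recorded */
};

/* Record the target together with its length, as the new helper does. */
static void toy_set_cached_link(struct toy_inode *inode, const char *link,
				size_t len)
{
	assert(inode != NULL && link != NULL);
	inode->link = link;
	inode->linklen = len;
	inode->cached = 1;
}

/* Copy up to buflen bytes of the target; returns the number copied. */
static size_t toy_readlink(const struct toy_inode *inode, char *buf,
			   size_t buflen)
{
	/* With a cached length, the strlen() walk is skipped entirely. */
	size_t len = inode->cached ? inode->linklen : strlen(inode->link);

	if (len > buflen)
		len = buflen;
	memcpy(buf, inode->link, len);
	return len;
}
```

As with readlink(2), the copy is truncated to the buffer size and is not NUL-terminated.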

fs/file.c

Lines changed: 7 additions & 15 deletions
@@ -279,17 +279,17 @@ static int expand_files(struct files_struct *files, unsigned int nr)
 	if (nr < fdt->max_fds)
 		return 0;

-	/* Can we expand? */
-	if (nr >= sysctl_nr_open)
-		return -EMFILE;
-
 	if (unlikely(files->resize_in_progress)) {
 		spin_unlock(&files->file_lock);
 		wait_event(files->resize_wait, !files->resize_in_progress);
 		spin_lock(&files->file_lock);
 		goto repeat;
 	}

+	/* Can we expand? */
+	if (unlikely(nr >= sysctl_nr_open))
+		return -EMFILE;
+
 	/* All good, so we try */
 	files->resize_in_progress = true;
 	error = expand_fdtable(files, nr);
@@ -1231,17 +1231,9 @@ __releases(&files->file_lock)

 	/*
 	 * We need to detect attempts to do dup2() over allocated but still
-	 * not finished descriptor. NB: OpenBSD avoids that at the price of
-	 * extra work in their equivalent of fget() - they insert struct
-	 * file immediately after grabbing descriptor, mark it larval if
-	 * more work (e.g. actual opening) is needed and make sure that
-	 * fget() treats larval files as absent. Potentially interesting,
-	 * but while extra work in fget() is trivial, locking implications
-	 * and amount of surgery on open()-related paths in VFS are not.
-	 * FreeBSD fails with -EBADF in the same situation, NetBSD "solution"
-	 * deadlocks in rather amusing ways, AFAICS. All of that is out of
-	 * scope of POSIX or SUS, since neither considers shared descriptor
-	 * tables and this condition does not arise without those.
+	 * not finished descriptor.
+	 *
+	 * POSIX is silent on the issue, we return -EBUSY.
 	 */
 	fdt = files_fdtable(files);
 	fd = array_index_nospec(fd, fdt->max_fds);

fs/file_table.c

Lines changed: 4 additions & 3 deletions
@@ -128,7 +128,7 @@ static struct ctl_table fs_stat_sysctls[] = {
 		.data = &sysctl_nr_open,
 		.maxlen = sizeof(unsigned int),
 		.mode = 0644,
-		.proc_handler = proc_dointvec_minmax,
+		.proc_handler = proc_douintvec_minmax,
 		.extra1 = &sysctl_nr_open_min,
 		.extra2 = &sysctl_nr_open_max,
 	},
@@ -478,6 +478,8 @@ static void ____fput(struct callback_head *work)
 	__fput(container_of(work, struct file, f_task_work));
 }

+static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
+
 /*
  * If kernel thread really needs to have the final fput() it has done
  * to complete, call this. The only user right now is the boot - we
@@ -491,11 +493,10 @@ static void ____fput(struct callback_head *work)
 void flush_delayed_fput(void)
 {
 	delayed_fput(NULL);
+	flush_delayed_work(&delayed_fput_work);
 }
 EXPORT_SYMBOL_GPL(flush_delayed_fput);

-static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
-
 void fput(struct file *file)
 {
 	if (file_ref_put(&file->f_ref)) {
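
The first hunk above swaps the sysctl handler for `sysctl_nr_open` from the signed to the unsigned variant, matching the `sizeof(unsigned int)` in the same table entry. The sketch below illustrates, in plain userspace C, why a handler's signedness should match the variable's type. The `toy_store_*` functions are hypothetical and do not reproduce the kernel's parsing code; they only show that a value can fit in an `unsigned int` while being out of range for an `int`.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical: parse s and store it if it lies within [min, max]. */
static int toy_store_uint_minmax(const char *s, unsigned int *dst,
				 unsigned int min, unsigned int max)
{
	unsigned long long v;
	char *end;

	assert(s != NULL && dst != NULL);
	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno || end == s || *end != '\0' || v < min || v > max)
		return -1;
	*dst = (unsigned int)v;
	return 0;
}

/* Same check through a signed view: anything above INT_MAX is rejected. */
static int toy_store_int_minmax(const char *s, int *dst, int min, int max)
{
	long long v;
	char *end;

	assert(s != NULL && dst != NULL);
	errno = 0;
	v = strtoll(s, &end, 10);
	if (errno || end == s || *end != '\0' || v < min || v > max)
		return -1;
	*dst = (int)v;
	return 0;
}
```

Whether such large values are reachable in practice depends on the configured `extra1`/`extra2` bounds; the point here is only type consistency.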

fs/fs_context.c

Lines changed: 1 addition & 1 deletion
@@ -493,7 +493,7 @@ static void put_fc_log(struct fs_context *fc)
 	if (log) {
 		if (refcount_dec_and_test(&log->usage)) {
 			fc->log.log = NULL;
-			for (i = 0; i <= 7; i++)
+			for (i = 0; i < ARRAY_SIZE(log->buffer) ; i++)
 				if (log->need_free & (1 << i))
 					kfree(log->buffer[i]);
 			kfree(log);
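
The hunk above replaces the loop bound `i <= 7` with `ARRAY_SIZE(log->buffer)`, so the bound follows the array declaration instead of repeating its size as a literal. A userspace sketch of the same pattern follows. `toy_log` and `toy_log_free` are hypothetical, and the eight-slot buffer with a one-byte `need_free` bitmask is an assumption modelled on the diff, not a copy of the kernel structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Element count of a true array (not a pointer), as in the kernel macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the log ring being released. */
struct toy_log {
	unsigned char need_free;	/* bit i set: buffer[i] was allocated */
	char *buffer[8];
};

/* Free every slot flagged in need_free; returns how many were freed. */
static unsigned int toy_log_free(struct toy_log *log)
{
	unsigned int freed = 0;
	size_t i;

	assert(log != NULL);

	/* The bound is derived from the declaration, not a magic number. */
	for (i = 0; i < ARRAY_SIZE(log->buffer); i++) {
		if (log->need_free & (1u << i)) {
			free(log->buffer[i]);
			log->buffer[i] = NULL;
			freed++;
		}
	}
	log->need_free = 0;
	return freed;
}
```

If the array were ever resized, the loop would follow automatically, although the width of the bitmask would still need checking by hand.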

fs/gfs2/quota.c

Lines changed: 1 addition & 2 deletions
@@ -236,8 +236,7 @@ static struct gfs2_quota_data *qd_alloc(unsigned hash, struct gfs2_sbd *sdp, str
 		return NULL;

 	qd->qd_sbd = sdp;
-	qd->qd_lockref.count = 0;
-	spin_lock_init(&qd->qd_lockref.lock);
+	lockref_init(&qd->qd_lockref, 0);
 	qd->qd_id = qid;
 	qd->qd_slot = -1;
 	INIT_LIST_HEAD(&qd->qd_lru);
