Skip to content

Commit 2be7d34

Browse files
committed
Revert "vfs: properly and reliably lock f_pos in fdget_pos()"
This reverts commit 0be0ee7. I was hoping it would be benign to switch over entirely to FMODE_STREAM, and we'd have just a couple of small fixups we'd need, but it looks like we're not quite there yet. While it worked fine on both my desktop and laptop, they are fairly similar in other respects, and run mostly the same loads. Kenneth Crudup reports that it seems to break both his vmware installation and the KDE upower service. In both cases apparently leading to timeouts due to waitinmg for the f_pos lock. There are a number of character devices in particular that definitely want stream-like behavior, but that currently don't get marked as streams, and as a result get the exclusion between concurrent read()/write() on the same file descriptor. Which doesn't work well for them. The most obvious example if this is /dev/console and /dev/tty, which use console_fops and tty_fops respectively (and ptmx_fops for the pty master side). It may be that it's just this that causes problems, but we clearly weren't ready yet. Because there's a number of other likely common cases that don't have llseek implementations and would seem to act as stream devices: /dev/fuse (fuse_dev_operations) /dev/mcelog (mce_chrdev_ops) /dev/mei0 (mei_fops) /dev/net/tun (tun_fops) /dev/nvme0 (nvme_dev_fops) /dev/tpm0 (tpm_fops) /proc/self/ns/mnt (ns_file_operations) /dev/snd/pcm* (snd_pcm_f_ops[]) and while some of these could be trivially automatically detected by the vfs layer when the character device is opened by just noticing that they have no read or write operations either, it often isn't that obvious. Some character devices most definitely do use the file position, even if they don't allow seeking: the firmware update code, for example, uses simple_read_from_buffer() that does use f_pos, but doesn't allow seeking back and forth. We'll revisit this when there's a better way to detect the problem and fix it (possibly with a coccinelle script to do more of the FMODE_STREAM annotations). Reported-by: Kenneth R. Crudup <[email protected]> Cc: Kirill Smelkov <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
1 parent ab851d4 commit 2be7d34

File tree

3 files changed

+8
-2
lines changed

3 files changed

+8
-2
lines changed

fs/file.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -795,7 +795,7 @@ unsigned long __fdget_pos(unsigned int fd)
795795
unsigned long v = __fdget(fd);
796796
struct file *file = (struct file *)(v & ~3);
797797

798-
if (file && !(file->f_mode & FMODE_STREAM)) {
798+
if (file && (file->f_mode & FMODE_ATOMIC_POS)) {
799799
if (file_count(file) > 1) {
800800
v |= FDPUT_POS_UNLOCK;
801801
mutex_lock(&file->f_pos_lock);

fs/open.c

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -771,6 +771,10 @@ static int do_dentry_open(struct file *f,
771771
f->f_mode |= FMODE_WRITER;
772772
}
773773

774+
/* POSIX.1-2008/SUSv4 Section XSI 2.9.7 */
775+
if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))
776+
f->f_mode |= FMODE_ATOMIC_POS;
777+
774778
f->f_op = fops_get(inode->i_fop);
775779
if (WARN_ON(!f->f_op)) {
776780
error = -ENODEV;
@@ -1252,7 +1256,7 @@ EXPORT_SYMBOL(nonseekable_open);
12521256
*/
12531257
int stream_open(struct inode *inode, struct file *filp)
12541258
{
1255-
filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE);
1259+
filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE | FMODE_ATOMIC_POS);
12561260
filp->f_mode |= FMODE_STREAM;
12571261
return 0;
12581262
}

include/linux/fs.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -148,6 +148,8 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
148148
/* File is opened with O_PATH; almost nothing can be done with it */
149149
#define FMODE_PATH ((__force fmode_t)0x4000)
150150

151+
/* File needs atomic accesses to f_pos */
152+
#define FMODE_ATOMIC_POS ((__force fmode_t)0x8000)
151153
/* Write access to underlying fs */
152154
#define FMODE_WRITER ((__force fmode_t)0x10000)
153155
/* Has read method(s) */

0 commit comments

Comments
 (0)