Commit 1535240

d-hans authored and Miklos Szeredi committed
fuse: allow non-extending parallel direct writes on the same file
Currently, FUSE serializes direct writes on the same file with the inode lock, i.e. we hold the inode lock for the full duration of the write request. I could not find a comment in the fuse code or git history that clearly explains why this exclusive lock is taken for direct writes. Possible reasons for acquiring an exclusive lock include, but may not be limited to:

1) Our guess is that some user space fuse implementations might be relying on this lock for serialization.
2) The lock protects against file read/write size races.
3) It rules out any issues arising from partial write failures.

This patch relaxes the exclusive lock for direct non-extending writes only. File size extending writes might not need the lock either, but we are not entirely sure whether relaxing it there risks introducing a regression. Furthermore, benchmarking with fio does not show a difference between patch versions that take, on file size extension, a) an exclusive lock and b) a shared lock.

A possible example of an issue with i_size extending writes is the write error case. Some writes might succeed and others might fail for file system internal reasons, for example ENOSPC. With parallel file size extending writes it _might_ be difficult to revert the action of the failing write, especially to restore the right i_size.

With these changes, we allow non-extending parallel direct writes on the same file with the help of a flag called FOPEN_PARALLEL_DIRECT_WRITES. If this flag is set on the file (the flag is passed from libfuse to the fuse kernel as part of file open/create), we no longer take the exclusive lock, but instead use a shared lock that allows non-extending writes to run in parallel.

FUSE implementations which rely on this inode lock for serialization can continue to do so, and serialized direct writes are still the default. Implementations that do not rely on this write serialization need to be updated to set the FOPEN_PARALLEL_DIRECT_WRITES flag in their file open/create reply.

On patch review there were concerns that network file systems (or multiple vfs mounts of the same file system) might have issues with parallel writes. We believe this is not the case, as this is just a local lock, which network file systems could not rely on anyway. I.e. this lock is just for local consistency.

Signed-off-by: Dharmendra Singh <[email protected]>
Signed-off-by: Bernd Schubert <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
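The locking policy described in the message can be modeled in user space as a pure predicate. This is only a minimal sketch, not the kernel code: `needs_exclusive_lock` and its parameters are hypothetical names chosen for illustration, and the flag constant is redefined locally with the value this patch adds to include/uapi/linux/fuse.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the flag value added by this patch (1 << 6). */
#define FOPEN_PARALLEL_DIRECT_WRITES (1u << 6)

/*
 * Hypothetical user space model of the lock-type decision.
 * Returns true when a direct write must take the exclusive inode lock.
 */
bool needs_exclusive_lock(uint32_t open_flags, bool append,
			  uint64_t pos, uint64_t count, uint64_t i_size)
{
	/* Serialized direct writes stay the default. */
	if (!(open_flags & FOPEN_PARALLEL_DIRECT_WRITES))
		return true;

	/* Append writes go to the end of the file, so treat them as extending. */
	if (append)
		return true;

	/* A write that ends past the current size extends the file. */
	return pos + count > i_size;
}
```

Under this model, a write that ends exactly at the current file size still counts as non-extending, because the comparison is strict.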
1 parent e2283a7 commit 1535240

2 files changed: 43 additions and 3 deletions

fs/fuse/file.c

Lines changed: 40 additions & 3 deletions
@@ -1563,14 +1563,47 @@ static ssize_t fuse_direct_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return res;
 }
 
+static bool fuse_direct_write_extending_i_size(struct kiocb *iocb,
+					       struct iov_iter *iter)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+
+	return iocb->ki_pos + iov_iter_count(iter) > i_size_read(inode);
+}
+
 static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
+	struct file *file = iocb->ki_filp;
+	struct fuse_file *ff = file->private_data;
 	struct fuse_io_priv io = FUSE_IO_PRIV_SYNC(iocb);
 	ssize_t res;
+	bool exclusive_lock =
+		!(ff->open_flags & FOPEN_PARALLEL_DIRECT_WRITES) ||
+		iocb->ki_flags & IOCB_APPEND ||
+		fuse_direct_write_extending_i_size(iocb, from);
+
+	/*
+	 * Take exclusive lock if
+	 * - Parallel direct writes are disabled - a user space decision
+	 * - Parallel direct writes are enabled and i_size is being extended.
+	 *   This might not be needed at all, but needs further investigation.
+	 */
+	if (exclusive_lock)
+		inode_lock(inode);
+	else {
+		inode_lock_shared(inode);
+
+		/* A race with truncate might have come up as the decision for
+		 * the lock type was done without holding the lock, check again.
+		 */
+		if (fuse_direct_write_extending_i_size(iocb, from)) {
+			inode_unlock_shared(inode);
+			inode_lock(inode);
+			exclusive_lock = true;
+		}
+	}
 
-	/* Don't allow parallel writes to the same file */
-	inode_lock(inode);
 	res = generic_write_checks(iocb, from);
 	if (res > 0) {
 		if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
@@ -1581,7 +1614,10 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
 			fuse_write_update_attr(inode, iocb->ki_pos, res);
 		}
 	}
-	inode_unlock(inode);
+	if (exclusive_lock)
+		inode_unlock(inode);
+	else
+		inode_unlock_shared(inode);
 
 	return res;
 }
@@ -2931,6 +2967,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 	if (iov_iter_rw(iter) == WRITE) {
 		fuse_write_update_attr(inode, pos, ret);
+		/* For extending writes we already hold exclusive lock */
 		if (ret < 0 && offset + count > i_size)
 			fuse_do_truncate(file);
 	}

include/uapi/linux/fuse.h

Lines changed: 3 additions & 0 deletions
@@ -200,6 +200,7 @@
  *
  *  7.38
  *  - add FUSE_EXPIRE_ONLY flag to fuse_notify_inval_entry
+ *  - add FOPEN_PARALLEL_DIRECT_WRITES
  */
 
 #ifndef _LINUX_FUSE_H
@@ -307,13 +308,15 @@ struct fuse_file_lock {
  * FOPEN_CACHE_DIR: allow caching this directory
  * FOPEN_STREAM: the file is stream-like (no file position at all)
  * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE)
+ * FOPEN_PARALLEL_DIRECT_WRITES: Allow concurrent direct writes on the same inode
  */
 #define FOPEN_DIRECT_IO		(1 << 0)
 #define FOPEN_KEEP_CACHE	(1 << 1)
 #define FOPEN_NONSEEKABLE	(1 << 2)
 #define FOPEN_CACHE_DIR		(1 << 3)
 #define FOPEN_STREAM		(1 << 4)
 #define FOPEN_NOFLUSH		(1 << 5)
+#define FOPEN_PARALLEL_DIRECT_WRITES	(1 << 6)
 
 /**
  * INIT request/reply flags
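A FUSE server opts in by setting the new bit in the open flags of its open/create reply. The following is a minimal sketch of composing that flags word. `make_open_flags` and its parameters are hypothetical, the constants are local copies of the FOPEN_* values from the header, and the choice to set the new flag only together with FOPEN_DIRECT_IO is an assumption of this sketch rather than something the patch states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the FOPEN_* bit values from include/uapi/linux/fuse.h. */
#define FOPEN_DIRECT_IO              (1u << 0)
#define FOPEN_PARALLEL_DIRECT_WRITES (1u << 6)

/*
 * Hypothetical helper: build the open_flags word for an open/create reply.
 * server_handles_concurrency should be true only if the server does not
 * depend on the kernel's inode lock to serialize direct writes.
 */
uint32_t make_open_flags(bool direct_io, bool server_handles_concurrency)
{
	uint32_t flags = 0;

	if (direct_io)
		flags |= FOPEN_DIRECT_IO;

	/* Assumption of this sketch: only opt in on the direct I/O path. */
	if (direct_io && server_handles_concurrency)
		flags |= FOPEN_PARALLEL_DIRECT_WRITES;

	return flags;
}
```

A server that leaves the bit clear keeps the previous behavior, since serialized direct writes remain the default.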
