Commit b5d760d

Merge tag 'vfs-6.17-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs iomap updates from Christian Brauner:

 - Refactor the iomap writeback code and split the generic and ioend/bio
   based writeback code.

   There are two methods that define the split between the generic
   writeback code, and the implementation of it, and all knowledge of
   ioends and bios now sits below that layer.

 - Add fuse iomap support for buffered writes and dirty folio writeback.

   This is needed so that granular uptodate and dirty tracking can be
   used in fuse when large folios are enabled. This has two big
   advantages. For writes, instead of the entire folio needing to be
   read into the page cache, only the relevant portions need to be. For
   writeback, only the dirty portions need to be written back instead of
   the entire folio.

* tag 'vfs-6.17-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fuse: refactor writeback to use iomap_writepage_ctx inode
  fuse: hook into iomap for invalidating and checking partial uptodateness
  fuse: use iomap for folio laundering
  fuse: use iomap for writeback
  fuse: use iomap for buffered writes
  iomap: build the writeback code without CONFIG_BLOCK
  iomap: add read_folio_range() handler for buffered writes
  iomap: improve argument passing to iomap_read_folio_sync
  iomap: replace iomap_folio_ops with iomap_write_ops
  iomap: export iomap_writeback_folio
  iomap: move folio_unlock out of iomap_writeback_folio
  iomap: rename iomap_writepage_map to iomap_writeback_folio
  iomap: move all ioend handling to ioend.c
  iomap: add public helpers for uptodate state manipulation
  iomap: hide ioends from the generic writeback code
  iomap: refactor the writeback interface
  iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks
  iomap: pass more arguments using the iomap writeback context
  iomap: header diet
2 parents 0965549 + d5212d8 commit b5d760d
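
The commit message credits granular uptodate and dirty tracking with two savings: fewer reads before a partial write, and less data written back. The following is a minimal userspace sketch of that idea, not kernel code. `struct folio_model`, `struct byte_range`, and the `model_*` functions are hypothetical names invented for illustration, and the model assumes one uptodate flag and one dirty flag per filesystem block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of per-block state inside one large folio. */
#define MODEL_MAX_BLOCKS 64

struct folio_model {
	size_t block_size;               /* bytes per filesystem block */
	size_t nr_blocks;                /* blocks covered by the folio */
	bool uptodate[MODEL_MAX_BLOCKS]; /* block contents match storage */
	bool dirty[MODEL_MAX_BLOCKS];    /* block must be written back */
};

struct byte_range {
	size_t start; /* byte offset within the folio */
	size_t len;   /* length in bytes */
};

/*
 * Count the blocks that must be read before writing [pos, pos + len).
 * Only a partially overwritten head or tail block that is not yet
 * uptodate needs a read; fully overwritten blocks never do.
 */
size_t model_blocks_to_read(const struct folio_model *f, size_t pos, size_t len)
{
	size_t first, last, count = 0;
	bool head_partial, tail_partial;

	assert(len > 0 && pos + len <= f->nr_blocks * f->block_size);
	first = pos / f->block_size;
	last = (pos + len - 1) / f->block_size;
	head_partial = pos % f->block_size != 0;
	tail_partial = (pos + len) % f->block_size != 0;

	if (head_partial && !f->uptodate[first])
		count++;
	/* Do not count the same block twice when head and tail coincide. */
	if (tail_partial && !f->uptodate[last] &&
	    !(head_partial && first == last))
		count++;
	return count;
}

/* Record a write; assumes any required partial-block reads already ran. */
void model_write(struct folio_model *f, size_t pos, size_t len)
{
	size_t i, last = (pos + len - 1) / f->block_size;

	for (i = pos / f->block_size; i <= last; i++) {
		f->uptodate[i] = true;
		f->dirty[i] = true;
	}
}

/* Find the first run of dirty blocks at or after block index `from`. */
bool model_next_dirty_range(const struct folio_model *f, size_t from,
			    struct byte_range *out)
{
	size_t i = from, start;

	while (i < f->nr_blocks && !f->dirty[i])
		i++;
	if (i >= f->nr_blocks)
		return false;
	start = i;
	while (i < f->nr_blocks && f->dirty[i])
		i++;
	out->start = start * f->block_size;
	out->len = (i - start) * f->block_size;
	return true;
}
```

In this model, a block-aligned 8 KiB write into a 64 KiB folio of 4 KiB blocks needs no reads, and writeback covers only the 8 KiB dirty run rather than the whole folio.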

27 files changed, +859 -805 lines changed

Documentation/filesystems/iomap/design.rst

Lines changed: 0 additions & 3 deletions
@@ -167,7 +167,6 @@ structure below:
      struct dax_device *dax_dev;
      void *inline_data;
      void *private;
-     const struct iomap_folio_ops *folio_ops;
      u64 validity_cookie;
  };
 
@@ -292,8 +291,6 @@ The fields are as follows:
    <https://lore.kernel.org/all/[email protected]/>`_.
    This value will be passed unchanged to ``->iomap_end``.
 
- * ``folio_ops`` will be covered in the section on pagecache operations.
-
  * ``validity_cookie`` is a magic freshness value set by the filesystem
    that should be used to detect stale mappings.
    For pagecache operations this is critical for correct operation

Documentation/filesystems/iomap/operations.rst

Lines changed: 28 additions & 29 deletions
@@ -57,21 +57,19 @@ The following address space operations can be wrapped easily:
  * ``bmap``
  * ``swap_activate``
 
-``struct iomap_folio_ops``
+``struct iomap_write_ops``
 --------------------------
 
-The ``->iomap_begin`` function for pagecache operations may set the
-``struct iomap::folio_ops`` field to an ops structure to override
-default behaviors of iomap:
-
 .. code-block:: c
 
- struct iomap_folio_ops {
+ struct iomap_write_ops {
      struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
                                 unsigned len);
      void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
                        struct folio *folio);
      bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
+     int (*read_folio_range)(const struct iomap_iter *iter,
+                             struct folio *folio, loff_t pos, size_t len);
  };
 
 iomap calls these functions:
@@ -127,6 +125,10 @@ iomap calls these functions:
     ``->iomap_valid``, then the iomap should considered stale and the
     validation failed.
 
+  - ``read_folio_range``: Called to synchronously read in the range that will
+    be written to. If this function is not provided, iomap will default to
+    submitting a bio read request.
+
 These ``struct kiocb`` flags are significant for buffered I/O with iomap:
 
  * ``IOCB_NOWAIT``: Turns on ``IOMAP_NOWAIT``.
@@ -271,7 +273,7 @@ writeback.
 It does not lock ``i_rwsem`` or ``invalidate_lock``.
 
 The dirty bit will be cleared for all folios run through the
-``->map_blocks`` machinery described below even if the writeback fails.
+``->writeback_range`` machinery described below even if the writeback fails.
 This is to prevent dirty folio clots when storage devices fail; an
 ``-EIO`` is recorded for userspace to collect via ``fsync``.
 
@@ -283,15 +285,14 @@ The ``ops`` structure must be specified and is as follows:
 .. code-block:: c
 
  struct iomap_writeback_ops {
-     int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
-                       loff_t offset, unsigned len);
-     int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
-     void (*discard_folio)(struct folio *folio, loff_t pos);
+     int (*writeback_range)(struct iomap_writepage_ctx *wpc,
+             struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
+     int (*writeback_submit)(struct iomap_writepage_ctx *wpc, int error);
  };
 
 The fields are as follows:
 
-  - ``map_blocks``: Sets ``wpc->iomap`` to the space mapping of the file
+  - ``writeback_range``: Sets ``wpc->iomap`` to the space mapping of the file
     range (in bytes) given by ``offset`` and ``len``.
     iomap calls this function for each dirty fs block in each dirty folio,
     though it will `reuse mappings
@@ -306,27 +307,26 @@ The fields are as follows:
     This revalidation must be open-coded by the filesystem; it is
     unclear if ``iomap::validity_cookie`` can be reused for this
     purpose.
-    This function must be supplied by the filesystem.
-
-  - ``submit_ioend``: Allows the file systems to hook into writeback bio
-    submission.
-    This might include pre-write space accounting updates, or installing
-    a custom ``->bi_end_io`` function for internal purposes, such as
-    deferring the ioend completion to a workqueue to run metadata update
-    transactions from process context before submitting the bio.
-    This function is optional.
 
-  - ``discard_folio``: iomap calls this function after ``->map_blocks``
-    fails to schedule I/O for any part of a dirty folio.
-    The function should throw away any reservations that may have been
-    made for the write.
+    If this methods fails to schedule I/O for any part of a dirty folio, it
+    should throw away any reservations that may have been made for the write.
     The folio will be marked clean and an ``-EIO`` recorded in the
     pagecache.
     Filesystems can use this callback to `remove
     <https://lore.kernel.org/all/[email protected]/>`_
     delalloc reservations to avoid having delalloc reservations for
     clean pagecache.
-    This function is optional.
+    This function must be supplied by the filesystem.
+
+  - ``writeback_submit``: Submit the previous built writeback context.
+    Block based file systems should use the iomap_ioend_writeback_submit
+    helper, other file system can implement their own.
+    File systems can optionall to hook into writeback bio submission.
+    This might include pre-write space accounting updates, or installing
+    a custom ``->bi_end_io`` function for internal purposes, such as
+    deferring the ioend completion to a workqueue to run metadata update
+    transactions from process context before submitting the bio.
+    This function must be supplied by the filesystem.
 
 Pagecache Writeback Completion
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -340,10 +340,9 @@ If the write failed, it will also set the error bits on the folios and
 the address space.
 This can happen in interrupt or process context, depending on the
 storage device.
-
 Filesystems that need to update internal bookkeeping (e.g. unwritten
-extent conversions) should provide a ``->submit_ioend`` function to
-set ``struct iomap_end::bio::bi_end_io`` to its own function.
+extent conversions) should set their own bi_end_io on the bios
+submitted by ``->submit_writeback``
 This function should call ``iomap_finish_ioends`` after finishing its
 own work (e.g. unwritten extent conversion).
 
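
The documentation change above replaces three hooks with two: ``->writeback_range`` is called per dirty range and ``->writeback_submit`` is called to submit what was built. The following userspace sketch illustrates that division of labour under stated assumptions and is not the kernel implementation. All `model_*` and `demo_*` names are hypothetical. The model also assumes that the range callback returns the number of bytes it queued or a negative errno, and that the submit callback always runs and receives the accumulated error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_wb_ctx;

/* Hypothetical mirror of the two-method split; names are illustrative. */
struct model_wb_ops {
	/* Queue I/O for part of [pos, pos + len); bytes queued or -errno. */
	long (*writeback_range)(struct model_wb_ctx *ctx, size_t pos,
				size_t len);
	/* Submit whatever was queued; receives the accumulated error. */
	int (*writeback_submit)(struct model_wb_ctx *ctx, int error);
};

struct model_wb_ctx {
	const struct model_wb_ops *ops;
	size_t queued_bytes;    /* bytes accepted by writeback_range */
	size_t submitted_bytes; /* bytes handed off by writeback_submit */
	int submit_calls;
};

/*
 * Generic side: knows nothing about how I/O is built. It walks the dirty
 * range, lets the backend consume it piece by piece, then asks the
 * backend to submit.
 */
int model_writeback(struct model_wb_ctx *ctx, size_t pos, size_t len)
{
	int error = 0;

	assert(ctx && ctx->ops);
	while (len > 0) {
		long ret = ctx->ops->writeback_range(ctx, pos, len);

		if (ret <= 0) {
			/* Treat "no progress" as an I/O error in this model. */
			error = ret ? (int)ret : -EIO;
			break;
		}
		if ((size_t)ret > len)
			ret = (long)len;
		pos += (size_t)ret;
		len -= (size_t)ret;
	}
	return ctx->ops->writeback_submit(ctx, error);
}

/* Example backend: accepts at most 4096 bytes per call. */
static long demo_range(struct model_wb_ctx *ctx, size_t pos, size_t len)
{
	size_t chunk = len < 4096 ? len : 4096;

	(void)pos;
	ctx->queued_bytes += chunk;
	return (long)chunk;
}

static int demo_submit(struct model_wb_ctx *ctx, int error)
{
	ctx->submit_calls++;
	if (!error)
		ctx->submitted_bytes = ctx->queued_bytes;
	return error;
}

const struct model_wb_ops demo_ops = {
	.writeback_range = demo_range,
	.writeback_submit = demo_submit,
};
```

The point of the split in this sketch is that `model_writeback` never inspects how a backend batches its I/O, which parallels the commit's statement that knowledge of ioends and bios sits below the two methods.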

block/fops.c

Lines changed: 25 additions & 12 deletions
@@ -540,30 +540,42 @@ static void blkdev_readahead(struct readahead_control *rac)
 	iomap_readahead(rac, &blkdev_iomap_ops);
 }
 
-static int blkdev_map_blocks(struct iomap_writepage_ctx *wpc,
-		struct inode *inode, loff_t offset, unsigned int len)
+static ssize_t blkdev_writeback_range(struct iomap_writepage_ctx *wpc,
+		struct folio *folio, u64 offset, unsigned int len, u64 end_pos)
 {
-	loff_t isize = i_size_read(inode);
+	loff_t isize = i_size_read(wpc->inode);
 
 	if (WARN_ON_ONCE(offset >= isize))
 		return -EIO;
-	if (offset >= wpc->iomap.offset &&
-	    offset < wpc->iomap.offset + wpc->iomap.length)
-		return 0;
-	return blkdev_iomap_begin(inode, offset, isize - offset,
-			IOMAP_WRITE, &wpc->iomap, NULL);
+
+	if (offset < wpc->iomap.offset ||
+	    offset >= wpc->iomap.offset + wpc->iomap.length) {
+		int error;
+
+		error = blkdev_iomap_begin(wpc->inode, offset, isize - offset,
+				IOMAP_WRITE, &wpc->iomap, NULL);
+		if (error)
+			return error;
+	}
+
+	return iomap_add_to_ioend(wpc, folio, offset, end_pos, len);
 }
 
 static const struct iomap_writeback_ops blkdev_writeback_ops = {
-	.map_blocks = blkdev_map_blocks,
+	.writeback_range = blkdev_writeback_range,
+	.writeback_submit = iomap_ioend_writeback_submit,
 };
 
 static int blkdev_writepages(struct address_space *mapping,
 		struct writeback_control *wbc)
 {
-	struct iomap_writepage_ctx wpc = { };
+	struct iomap_writepage_ctx wpc = {
+		.inode = mapping->host,
+		.wbc = wbc,
+		.ops = &blkdev_writeback_ops
+	};
 
-	return iomap_writepages(mapping, wbc, &wpc, &blkdev_writeback_ops);
+	return iomap_writepages(&wpc);
 }
 
 const struct address_space_operations def_blk_aops = {
@@ -714,7 +726,8 @@ blkdev_direct_write(struct kiocb *iocb, struct iov_iter *from)
 
 static ssize_t blkdev_buffered_write(struct kiocb *iocb, struct iov_iter *from)
 {
-	return iomap_file_buffered_write(iocb, from, &blkdev_iomap_ops, NULL);
+	return iomap_file_buffered_write(iocb, from, &blkdev_iomap_ops, NULL,
+			NULL);
 }
 
 /*
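
The new `blkdev_writeback_range` above only looks up a fresh mapping when `offset` falls outside the mapping cached in `wpc->iomap`. The following userspace sketch isolates that check so it can be exercised on its own. `struct model_map`, `struct model_ctx`, `model_lookup`, and `model_writeback_range` are hypothetical stand-ins, and the lookup simply maps everything from `offset` to end of file, which is an assumption based on the `isize - offset` length passed in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the cached mapping in the writeback context. */
struct model_map {
	uint64_t offset; /* first byte covered by the mapping */
	uint64_t length; /* bytes covered; 0 means "nothing cached yet" */
};

struct model_ctx {
	uint64_t isize;       /* size of the file or device in bytes */
	struct model_map map; /* mapping cached between calls */
	unsigned int lookups; /* how many times a new mapping was needed */
};

/* Assumed lookup: map everything from `offset` to the end of the file. */
static int model_lookup(struct model_ctx *ctx, uint64_t offset)
{
	ctx->map.offset = offset;
	ctx->map.length = ctx->isize - offset;
	ctx->lookups++;
	return 0;
}

/*
 * Same shape as the check in the diff: reject offsets at or beyond EOF,
 * reuse the cached mapping when it covers `offset`, otherwise look up a
 * new one. Queuing the actual I/O is left out of this model.
 */
int model_writeback_range(struct model_ctx *ctx, uint64_t offset)
{
	assert(ctx);
	if (offset >= ctx->isize)
		return -EIO;

	if (offset < ctx->map.offset ||
	    offset >= ctx->map.offset + ctx->map.length) {
		int error = model_lookup(ctx, offset);

		if (error)
			return error;
	}
	return 0;
}
```

With a zero-initialized context the first call always performs a lookup, because an empty mapping covers no offset. Later calls inside the cached range reuse it, and an offset below the cached start triggers a second lookup.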

fs/fuse/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 config FUSE_FS
 	tristate "FUSE (Filesystem in Userspace) support"
 	select FS_POSIX_ACL
+	select FS_IOMAP
 	help
 	  With FUSE it is possible to implement a fully functional filesystem
 	  in a userspace program.
