Skip to content

Commit 2f368b5

Browse files
committed
Merge patch series "refactor the iomap writeback code v5"
Christoph Hellwig <[email protected]> says: This is an alternative approach to the writeback part of the "fuse: use iomap for buffered writes + writeback" series from Joanne. The big difference compared to Joanne's version is that I hope the split between the generic and ioend/bio based writeback code is a bit cleaner here. We have two methods that define the split between the generic writeback code, and the implemementation of it, and all knowledge of ioends and bios now sits below that layer. This version passes testing on xfs, and gets as far as mainline for gfs2 (crashes in generic/361). * patches from https://lore.kernel.org/[email protected]: iomap: build the writeback code without CONFIG_BLOCK iomap: add read_folio_range() handler for buffered writes iomap: improve argument passing to iomap_read_folio_sync iomap: replace iomap_folio_ops with iomap_write_ops iomap: export iomap_writeback_folio iomap: move folio_unlock out of iomap_writeback_folio iomap: rename iomap_writepage_map to iomap_writeback_folio iomap: move all ioend handling to ioend.c iomap: add public helpers for uptodate state manipulation iomap: hide ioends from the generic writeback code iomap: refactor the writeback interface iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks iomap: pass more arguments using the iomap writeback context iomap: header diet Link: https://lore.kernel.org/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2 parents 19272b3 + 5699b7e commit 2f368b5

File tree

25 files changed

+705
-609
lines changed

25 files changed

+705
-609
lines changed

Documentation/filesystems/iomap/design.rst

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -167,7 +167,6 @@ structure below:
167167
struct dax_device *dax_dev;
168168
void *inline_data;
169169
void *private;
170-
const struct iomap_folio_ops *folio_ops;
171170
u64 validity_cookie;
172171
};
173172
@@ -292,8 +291,6 @@ The fields are as follows:
292291
<https://lore.kernel.org/all/[email protected]/>`_.
293292
This value will be passed unchanged to ``->iomap_end``.
294293

295-
* ``folio_ops`` will be covered in the section on pagecache operations.
296-
297294
* ``validity_cookie`` is a magic freshness value set by the filesystem
298295
that should be used to detect stale mappings.
299296
For pagecache operations this is critical for correct operation

Documentation/filesystems/iomap/operations.rst

Lines changed: 28 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -57,21 +57,19 @@ The following address space operations can be wrapped easily:
5757
* ``bmap``
5858
* ``swap_activate``
5959

60-
``struct iomap_folio_ops``
60+
``struct iomap_write_ops``
6161
--------------------------
6262

63-
The ``->iomap_begin`` function for pagecache operations may set the
64-
``struct iomap::folio_ops`` field to an ops structure to override
65-
default behaviors of iomap:
66-
6763
.. code-block:: c
6864
69-
struct iomap_folio_ops {
65+
struct iomap_write_ops {
7066
struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
7167
unsigned len);
7268
void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
7369
struct folio *folio);
7470
bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
71+
int (*read_folio_range)(const struct iomap_iter *iter,
72+
struct folio *folio, loff_t pos, size_t len);
7573
};
7674
7775
iomap calls these functions:
@@ -127,6 +125,10 @@ iomap calls these functions:
127125
``->iomap_valid``, then the iomap should considered stale and the
128126
validation failed.
129127

128+
- ``read_folio_range``: Called to synchronously read in the range that will
129+
be written to. If this function is not provided, iomap will default to
130+
submitting a bio read request.
131+
130132
These ``struct kiocb`` flags are significant for buffered I/O with iomap:
131133

132134
* ``IOCB_NOWAIT``: Turns on ``IOMAP_NOWAIT``.
@@ -271,7 +273,7 @@ writeback.
271273
It does not lock ``i_rwsem`` or ``invalidate_lock``.
272274

273275
The dirty bit will be cleared for all folios run through the
274-
``->map_blocks`` machinery described below even if the writeback fails.
276+
``->writeback_range`` machinery described below even if the writeback fails.
275277
This is to prevent dirty folio clots when storage devices fail; an
276278
``-EIO`` is recorded for userspace to collect via ``fsync``.
277279

@@ -283,15 +285,14 @@ The ``ops`` structure must be specified and is as follows:
283285
.. code-block:: c
284286
285287
struct iomap_writeback_ops {
286-
int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
287-
loff_t offset, unsigned len);
288-
int (*submit_ioend)(struct iomap_writepage_ctx *wpc, int status);
289-
void (*discard_folio)(struct folio *folio, loff_t pos);
288+
int (*writeback_range)(struct iomap_writepage_ctx *wpc,
289+
struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
290+
int (*writeback_submit)(struct iomap_writepage_ctx *wpc, int error);
290291
};
291292
292293
The fields are as follows:
293294

294-
- ``map_blocks``: Sets ``wpc->iomap`` to the space mapping of the file
295+
- ``writeback_range``: Sets ``wpc->iomap`` to the space mapping of the file
295296
range (in bytes) given by ``offset`` and ``len``.
296297
iomap calls this function for each dirty fs block in each dirty folio,
297298
though it will `reuse mappings
@@ -306,27 +307,26 @@ The fields are as follows:
306307
This revalidation must be open-coded by the filesystem; it is
307308
unclear if ``iomap::validity_cookie`` can be reused for this
308309
purpose.
309-
This function must be supplied by the filesystem.
310-
311-
- ``submit_ioend``: Allows the file systems to hook into writeback bio
312-
submission.
313-
This might include pre-write space accounting updates, or installing
314-
a custom ``->bi_end_io`` function for internal purposes, such as
315-
deferring the ioend completion to a workqueue to run metadata update
316-
transactions from process context before submitting the bio.
317-
This function is optional.
318310

319-
- ``discard_folio``: iomap calls this function after ``->map_blocks``
320-
fails to schedule I/O for any part of a dirty folio.
321-
The function should throw away any reservations that may have been
322-
made for the write.
311+
If this methods fails to schedule I/O for any part of a dirty folio, it
312+
should throw away any reservations that may have been made for the write.
323313
The folio will be marked clean and an ``-EIO`` recorded in the
324314
pagecache.
325315
Filesystems can use this callback to `remove
326316
<https://lore.kernel.org/all/[email protected]/>`_
327317
delalloc reservations to avoid having delalloc reservations for
328318
clean pagecache.
329-
This function is optional.
319+
This function must be supplied by the filesystem.
320+
321+
- ``writeback_submit``: Submit the previous built writeback context.
322+
Block based file systems should use the iomap_ioend_writeback_submit
323+
helper, other file system can implement their own.
324+
File systems can optionall to hook into writeback bio submission.
325+
This might include pre-write space accounting updates, or installing
326+
a custom ``->bi_end_io`` function for internal purposes, such as
327+
deferring the ioend completion to a workqueue to run metadata update
328+
transactions from process context before submitting the bio.
329+
This function must be supplied by the filesystem.
330330

331331
Pagecache Writeback Completion
332332
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -340,10 +340,9 @@ If the write failed, it will also set the error bits on the folios and
340340
the address space.
341341
This can happen in interrupt or process context, depending on the
342342
storage device.
343-
344343
Filesystems that need to update internal bookkeeping (e.g. unwritten
345-
extent conversions) should provide a ``->submit_ioend`` function to
346-
set ``struct iomap_end::bio::bi_end_io`` to its own function.
344+
extent conversions) should set their own bi_end_io on the bios
345+
submitted by ``->submit_writeback``
347346
This function should call ``iomap_finish_ioends`` after finishing its
348347
own work (e.g. unwritten extent conversion).
349348

block/fops.c

Lines changed: 25 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -537,30 +537,42 @@ static void blkdev_readahead(struct readahead_control *rac)
537537
iomap_readahead(rac, &blkdev_iomap_ops);
538538
}
539539

540-
static int blkdev_map_blocks(struct iomap_writepage_ctx *wpc,
541-
struct inode *inode, loff_t offset, unsigned int len)
540+
static ssize_t blkdev_writeback_range(struct iomap_writepage_ctx *wpc,
541+
struct folio *folio, u64 offset, unsigned int len, u64 end_pos)
542542
{
543-
loff_t isize = i_size_read(inode);
543+
loff_t isize = i_size_read(wpc->inode);
544544

545545
if (WARN_ON_ONCE(offset >= isize))
546546
return -EIO;
547-
if (offset >= wpc->iomap.offset &&
548-
offset < wpc->iomap.offset + wpc->iomap.length)
549-
return 0;
550-
return blkdev_iomap_begin(inode, offset, isize - offset,
551-
IOMAP_WRITE, &wpc->iomap, NULL);
547+
548+
if (offset < wpc->iomap.offset ||
549+
offset >= wpc->iomap.offset + wpc->iomap.length) {
550+
int error;
551+
552+
error = blkdev_iomap_begin(wpc->inode, offset, isize - offset,
553+
IOMAP_WRITE, &wpc->iomap, NULL);
554+
if (error)
555+
return error;
556+
}
557+
558+
return iomap_add_to_ioend(wpc, folio, offset, end_pos, len);
552559
}
553560

554561
static const struct iomap_writeback_ops blkdev_writeback_ops = {
555-
.map_blocks = blkdev_map_blocks,
562+
.writeback_range = blkdev_writeback_range,
563+
.writeback_submit = iomap_ioend_writeback_submit,
556564
};
557565

558566
static int blkdev_writepages(struct address_space *mapping,
559567
struct writeback_control *wbc)
560568
{
561-
struct iomap_writepage_ctx wpc = { };
569+
struct iomap_writepage_ctx wpc = {
570+
.inode = mapping->host,
571+
.wbc = wbc,
572+
.ops = &blkdev_writeback_ops
573+
};
562574

563-
return iomap_writepages(mapping, wbc, &wpc, &blkdev_writeback_ops);
575+
return iomap_writepages(&wpc);
564576
}
565577

566578
const struct address_space_operations def_blk_aops = {
@@ -711,7 +723,8 @@ blkdev_direct_write(struct kiocb *iocb, struct iov_iter *from)
711723

712724
static ssize_t blkdev_buffered_write(struct kiocb *iocb, struct iov_iter *from)
713725
{
714-
return iomap_file_buffered_write(iocb, from, &blkdev_iomap_ops, NULL);
726+
return iomap_file_buffered_write(iocb, from, &blkdev_iomap_ops, NULL,
727+
NULL);
715728
}
716729

717730
/*

fs/gfs2/aops.c

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -159,7 +159,11 @@ static int gfs2_writepages(struct address_space *mapping,
159159
struct writeback_control *wbc)
160160
{
161161
struct gfs2_sbd *sdp = gfs2_mapping2sbd(mapping);
162-
struct iomap_writepage_ctx wpc = { };
162+
struct iomap_writepage_ctx wpc = {
163+
.inode = mapping->host,
164+
.wbc = wbc,
165+
.ops = &gfs2_writeback_ops,
166+
};
163167
int ret;
164168

165169
/*
@@ -168,7 +172,7 @@ static int gfs2_writepages(struct address_space *mapping,
168172
* want balance_dirty_pages() to loop indefinitely trying to write out
169173
* pages held in the ail that it can't find.
170174
*/
171-
ret = iomap_writepages(mapping, wbc, &wpc, &gfs2_writeback_ops);
175+
ret = iomap_writepages(&wpc);
172176
if (ret == 0 && wbc->nr_to_write > 0)
173177
set_bit(SDF_FORCE_AIL_FLUSH, &sdp->sd_flags);
174178
return ret;

fs/gfs2/bmap.c

Lines changed: 27 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -963,12 +963,16 @@ static struct folio *
963963
gfs2_iomap_get_folio(struct iomap_iter *iter, loff_t pos, unsigned len)
964964
{
965965
struct inode *inode = iter->inode;
966+
struct gfs2_inode *ip = GFS2_I(inode);
966967
unsigned int blockmask = i_blocksize(inode) - 1;
967968
struct gfs2_sbd *sdp = GFS2_SB(inode);
968969
unsigned int blocks;
969970
struct folio *folio;
970971
int status;
971972

973+
if (!gfs2_is_jdata(ip) && !gfs2_is_stuffed(ip))
974+
return iomap_get_folio(iter, pos, len);
975+
972976
blocks = ((pos & blockmask) + len + blockmask) >> inode->i_blkbits;
973977
status = gfs2_trans_begin(sdp, RES_DINODE + blocks, 0);
974978
if (status)
@@ -987,21 +991,22 @@ static void gfs2_iomap_put_folio(struct inode *inode, loff_t pos,
987991
struct gfs2_inode *ip = GFS2_I(inode);
988992
struct gfs2_sbd *sdp = GFS2_SB(inode);
989993

990-
if (!gfs2_is_stuffed(ip))
994+
if (gfs2_is_jdata(ip) && !gfs2_is_stuffed(ip))
991995
gfs2_trans_add_databufs(ip->i_gl, folio,
992996
offset_in_folio(folio, pos),
993997
copied);
994998

995999
folio_unlock(folio);
9961000
folio_put(folio);
9971001

998-
if (tr->tr_num_buf_new)
999-
__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
1000-
1001-
gfs2_trans_end(sdp);
1002+
if (gfs2_is_jdata(ip) || gfs2_is_stuffed(ip)) {
1003+
if (tr->tr_num_buf_new)
1004+
__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
1005+
gfs2_trans_end(sdp);
1006+
}
10021007
}
10031008

1004-
static const struct iomap_folio_ops gfs2_iomap_folio_ops = {
1009+
const struct iomap_write_ops gfs2_iomap_write_ops = {
10051010
.get_folio = gfs2_iomap_get_folio,
10061011
.put_folio = gfs2_iomap_put_folio,
10071012
};
@@ -1078,8 +1083,6 @@ static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
10781083
gfs2_trans_end(sdp);
10791084
}
10801085

1081-
if (gfs2_is_stuffed(ip) || gfs2_is_jdata(ip))
1082-
iomap->folio_ops = &gfs2_iomap_folio_ops;
10831086
return 0;
10841087

10851088
out_trans_end:
@@ -1304,7 +1307,7 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from, loff_t length
13041307
return 0;
13051308
length = min(length, inode->i_size - from);
13061309
return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
1307-
NULL);
1310+
&gfs2_iomap_write_ops, NULL);
13081311
}
13091312

13101313
#define GFS2_JTRUNC_REVOKES 8192
@@ -2469,23 +2472,26 @@ int __gfs2_punch_hole(struct file *file, loff_t offset, loff_t length)
24692472
return error;
24702473
}
24712474

2472-
static int gfs2_map_blocks(struct iomap_writepage_ctx *wpc, struct inode *inode,
2473-
loff_t offset, unsigned int len)
2475+
static ssize_t gfs2_writeback_range(struct iomap_writepage_ctx *wpc,
2476+
struct folio *folio, u64 offset, unsigned int len, u64 end_pos)
24742477
{
2475-
int ret;
2476-
2477-
if (WARN_ON_ONCE(gfs2_is_stuffed(GFS2_I(inode))))
2478+
if (WARN_ON_ONCE(gfs2_is_stuffed(GFS2_I(wpc->inode))))
24782479
return -EIO;
24792480

2480-
if (offset >= wpc->iomap.offset &&
2481-
offset < wpc->iomap.offset + wpc->iomap.length)
2482-
return 0;
2481+
if (offset < wpc->iomap.offset ||
2482+
offset >= wpc->iomap.offset + wpc->iomap.length) {
2483+
int ret;
24832484

2484-
memset(&wpc->iomap, 0, sizeof(wpc->iomap));
2485-
ret = gfs2_iomap_get(inode, offset, INT_MAX, &wpc->iomap);
2486-
return ret;
2485+
memset(&wpc->iomap, 0, sizeof(wpc->iomap));
2486+
ret = gfs2_iomap_get(wpc->inode, offset, INT_MAX, &wpc->iomap);
2487+
if (ret)
2488+
return ret;
2489+
}
2490+
2491+
return iomap_add_to_ioend(wpc, folio, offset, end_pos, len);
24872492
}
24882493

24892494
const struct iomap_writeback_ops gfs2_writeback_ops = {
2490-
.map_blocks = gfs2_map_blocks,
2495+
.writeback_range = gfs2_writeback_range,
2496+
.writeback_submit = iomap_ioend_writeback_submit,
24912497
};

fs/gfs2/bmap.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -44,6 +44,7 @@ static inline void gfs2_write_calc_reserv(const struct gfs2_inode *ip,
4444
}
4545

4646
extern const struct iomap_ops gfs2_iomap_ops;
47+
extern const struct iomap_write_ops gfs2_iomap_write_ops;
4748
extern const struct iomap_writeback_ops gfs2_writeback_ops;
4849

4950
int gfs2_unstuff_dinode(struct gfs2_inode *ip);

fs/gfs2/file.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1058,7 +1058,8 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
10581058
}
10591059

10601060
pagefault_disable();
1061-
ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops, NULL);
1061+
ret = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops,
1062+
&gfs2_iomap_write_ops, NULL);
10621063
pagefault_enable();
10631064
if (ret > 0)
10641065
written += ret;

fs/iomap/Makefile

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -9,9 +9,9 @@ ccflags-y += -I $(src) # needed for trace events
99
obj-$(CONFIG_FS_IOMAP) += iomap.o
1010

1111
iomap-y += trace.o \
12-
iter.o
13-
iomap-$(CONFIG_BLOCK) += buffered-io.o \
14-
direct-io.o \
12+
iter.o \
13+
buffered-io.o
14+
iomap-$(CONFIG_BLOCK) += direct-io.o \
1515
ioend.o \
1616
fiemap.o \
1717
seek.o

0 commit comments

Comments
 (0)