Skip to content

Commit 4390f01

Browse files
Brian Fostercmaiolino
authored andcommitted
xfs: don't free cowblocks from under dirty pagecache on unshare
fallocate unshare mode explicitly breaks extent sharing. When a command completes, it checks the data fork for any remaining shared extents to determine whether the reflink inode flag and COW fork preallocation can be removed. This logic doesn't consider in-core pagecache and I/O state, however, which means we can unsafely remove COW fork blocks that are still needed under certain conditions. For example, consider the following command sequence: xfs_io -fc "pwrite 0 1k" -c "reflink <file> 0 256k 1k" \ -c "pwrite 0 32k" -c "funshare 0 1k" <file> This allocates a data block at offset 0, shares it, and then overwrites it with a larger buffered write. The overwrite triggers COW fork preallocation, 32 blocks by default, which maps the entire 32k write to delalloc in the COW fork. All but the shared block at offset 0 remains hole mapped in the data fork. The unshare command redirties and flushes the folio at offset 0, removing the only shared extent from the inode. Since the inode no longer maps shared extents, unshare purges the COW fork before the remaining 28k may have written back. This leaves dirty pagecache backed by holes, which writeback quietly skips, thus leaving clean, non-zeroed pagecache over holes in the file. To verify, fiemap shows holes in the first 32k of the file and reads return different data across a remount: $ xfs_io -c "fiemap -v" <file> <file>: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS ... 1: [8..511]: hole 504 ... $ xfs_io -c "pread -v 4k 8" <file> 00001000: cd cd cd cd cd cd cd cd ........ $ umount <mnt>; mount <dev> <mnt> $ xfs_io -c "pread -v 4k 8" <file> 00001000: 00 00 00 00 00 00 00 00 ........ To avoid this problem, make unshare follow the same rules used for background cowblock scanning and never purge the COW fork for inodes with dirty pagecache or in-flight I/O. Fixes: 46afb06 ("xfs: only flush the unshared range in xfs_reflink_unshare") Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]>
1 parent 90a71da commit 4390f01

File tree

3 files changed

+23
-7
lines changed

3 files changed

+23
-7
lines changed

fs/xfs/xfs_icache.c

Lines changed: 1 addition & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1317,13 +1317,7 @@ xfs_prep_free_cowblocks(
13171317
*/
13181318
if (!sync && inode_is_open_for_write(VFS_I(ip)))
13191319
return false;
1320-
if ((VFS_I(ip)->i_state & I_DIRTY_PAGES) ||
1321-
mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_DIRTY) ||
1322-
mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_WRITEBACK) ||
1323-
atomic_read(&VFS_I(ip)->i_dio_count))
1324-
return false;
1325-
1326-
return true;
1320+
return xfs_can_free_cowblocks(ip);
13271321
}
13281322

13291323
/*

fs/xfs/xfs_reflink.c

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1595,6 +1595,9 @@ xfs_reflink_clear_inode_flag(
15951595

15961596
ASSERT(xfs_is_reflink_inode(ip));
15971597

1598+
if (!xfs_can_free_cowblocks(ip))
1599+
return 0;
1600+
15981601
error = xfs_reflink_inode_has_shared_extents(*tpp, ip, &needs_flag);
15991602
if (error || needs_flag)
16001603
return error;

fs/xfs/xfs_reflink.h

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,25 @@
66
#ifndef __XFS_REFLINK_H
77
#define __XFS_REFLINK_H 1
88

9+
/*
10+
* Check whether it is safe to free COW fork blocks from an inode. It is unsafe
11+
* to do so when an inode has dirty cache or I/O in-flight, even if no shared
12+
* extents exist in the data fork, because outstanding I/O may target blocks
13+
* that were speculatively allocated to the COW fork.
14+
*/
15+
static inline bool
16+
xfs_can_free_cowblocks(struct xfs_inode *ip)
17+
{
18+
struct inode *inode = VFS_I(ip);
19+
20+
if ((inode->i_state & I_DIRTY_PAGES) ||
21+
mapping_tagged(inode->i_mapping, PAGECACHE_TAG_DIRTY) ||
22+
mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK) ||
23+
atomic_read(&inode->i_dio_count))
24+
return false;
25+
return true;
26+
}
27+
928
extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip,
1029
struct xfs_bmbt_irec *irec, bool *shared);
1130
int xfs_bmap_trim_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *imap,

0 commit comments

Comments
 (0)