Commit fa91e4a

adam900710 authored and kdave committed
btrfs: qgroup: fix data leak caused by race between writeback and truncate
[BUG]
When running tests like generic/013 on test device with btrfs quota
enabled, it can normally lead to data leak, detected at unmount time:

  BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
  ------------[ cut here ]------------
  WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
  RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
  Call Trace:
   btrfs_put_super+0x15/0x17 [btrfs]
   generic_shutdown_super+0x72/0x110
   kill_anon_super+0x18/0x30
   btrfs_kill_super+0x17/0x30 [btrfs]
   deactivate_locked_super+0x3b/0xa0
   deactivate_super+0x40/0x50
   cleanup_mnt+0x135/0x190
   __cleanup_mnt+0x12/0x20
   task_work_run+0x64/0xb0
   __prepare_exit_to_usermode+0x1bc/0x1c0
   __syscall_return_slowpath+0x47/0x230
   do_syscall_64+0x64/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  ---[ end trace caf08beafeca2392 ]---
  BTRFS error (device dm-3): qgroup reserved space leaked

[CAUSE]
In the offending case, the offending operations are:
2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0

The following sequence of events could happen after the writev():

           CPU1 (writeback)             |        CPU2 (truncate)
  -----------------------------------------------------------------
  btrfs_writepages()                    |
  |- extent_write_cache_pages()         |
     |- Got page for 1003520            |
     |  1003520 is Dirty, no writeback  |
     |  So (!clear_page_dirty_for_io()) |
     |  gets called for it              |
     |- Now page 1003520 is Clean.      |
     |                                  | btrfs_setattr()
     |                                  | |- btrfs_setsize()
     |                                  |    |- truncate_setsize()
     |                                  |       New i_size is 18388
     |- __extent_writepage()            |
     |  |- page_offset() > i_size       |
        |- btrfs_invalidatepage()       |
           |- Page is clean, so no qgroup
              callback executed

This means, the qgroup reserved data space is not properly released in
btrfs_invalidatepage() as the page is Clean.

[FIX]
Instead of checking the dirty bit of a page, call
btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().

As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
io_tree, not bound to page status, thus we won't cause double freeing
anyway.

Fixes: 0b34c26 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero")
CC: [email protected] # 4.14+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
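
The race described in [CAUSE] can be replayed in a small user-space model. This is only an illustrative sketch, not kernel code: every `toy_*` name is hypothetical, the page is reduced to a dirty flag plus a reservation bit standing in for QGROUP_RESERVED, and the two CPUs are serialized in the order shown in the diagram.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; none of these names are kernel APIs. */
#define TOY_PAGE_SIZE 4096u

struct toy_page {
	bool dirty;	/* models PageDirty() */
	bool rsv_bit;	/* models the QGROUP_RESERVED bit in io_tree */
};

struct toy_qgroup {
	size_t reserved;	/* bytes of reserved data space */
};

/* Buffered write: reserve space once, then dirty the page. */
static void toy_write(struct toy_qgroup *qg, struct toy_page *pg)
{
	if (!pg->rsv_bit) {
		pg->rsv_bit = true;
		qg->reserved += TOY_PAGE_SIZE;
	}
	pg->dirty = true;
}

/* First step of writeback: models clear_page_dirty_for_io(). */
static void toy_writeback_clear_dirty(struct toy_page *pg)
{
	pg->dirty = false;
}

/* Release the reservation; keyed on the reservation bit only. */
static void toy_free_data(struct toy_qgroup *qg, struct toy_page *pg)
{
	if (pg->rsv_bit) {
		assert(qg->reserved >= TOY_PAGE_SIZE);
		pg->rsv_bit = false;
		qg->reserved -= TOY_PAGE_SIZE;
	}
}

/*
 * Invalidate a page that lies beyond the new i_size.
 * check_dirty == true  models the old "if (PageDirty(page))" guard.
 * check_dirty == false models the unconditional call from this commit.
 */
static void toy_invalidate(struct toy_qgroup *qg, struct toy_page *pg,
			   bool check_dirty)
{
	if (!check_dirty || pg->dirty)
		toy_free_data(qg, pg);
	pg->dirty = false;
}

/* Replay the interleaving; returns the bytes still reserved afterwards. */
static size_t toy_replay_race(bool check_dirty)
{
	struct toy_qgroup qg = { 0 };
	struct toy_page pg = { false, false };

	toy_write(&qg, &pg);			/* writev(): dirty + reserved */
	toy_writeback_clear_dirty(&pg);		/* CPU1: page becomes clean */
	/* CPU2: truncate shrinks i_size below this page's offset */
	toy_invalidate(&qg, &pg, check_dirty);	/* CPU1: page is past i_size */
	return qg.reserved;
}
```

In this model, `toy_replay_race(true)` leaves one page worth of reservation behind, which corresponds to the `rsv 4096` in the unmount warning, while `toy_replay_race(false)` returns 0.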
1 parent 580c079 commit fa91e4a

1 file changed, 10 insertions(+), 13 deletions(-)

fs/btrfs/inode.c

Lines changed: 10 additions & 13 deletions
@@ -8136,20 +8136,17 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 	/*
 	 * Qgroup reserved space handler
 	 * Page here will be either
-	 * 1) Already written to disk
-	 *    In this case, its reserved space is released from data rsv map
-	 *    and will be freed by delayed_ref handler finally.
-	 *    So even we call qgroup_free_data(), it won't decrease reserved
-	 *    space.
-	 * 2) Not written to disk
-	 *    This means the reserved space should be freed here. However,
-	 *    if a truncate invalidates the page (by clearing PageDirty)
-	 *    and the page is accounted for while allocating extent
-	 *    in btrfs_check_data_free_space() we let delayed_ref to
-	 *    free the entire extent.
+	 * 1) Already written to disk or ordered extent already submitted
+	 *    Then its QGROUP_RESERVED bit in io_tree is already cleaned.
+	 *    Qgroup will be handled by its qgroup_record then.
+	 *    btrfs_qgroup_free_data() call will do nothing here.
+	 *
+	 * 2) Not written to disk yet
+	 *    Then btrfs_qgroup_free_data() call will clear the QGROUP_RESERVED
+	 *    bit of its io_tree, and free the qgroup reserved data space.
+	 *    Since the IO will never happen for this page.
 	 */
-	if (PageDirty(page))
-		btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
+	btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
 	if (!inode_evicting) {
 		clear_extent_bit(tree, page_start, page_end, EXTENT_LOCKED |
 					 EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
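
The new comment relies on the free call being a no-op once the QGROUP_RESERVED bit is already cleared. A minimal sketch of why a bit-keyed, range-based release cannot double free is given below. It assumes a per-page bitmap in place of the real io_tree, and the `sk_*` names are hypothetical stand-ins, not the signatures of `btrfs_qgroup_free_data()` or any other kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_PAGE_SIZE 4096u
#define SK_NPAGES    8u

/* Hypothetical stand-in for an inode's io_tree plus its qgroup counter. */
struct sk_inode {
	bool   reserved_bit[SK_NPAGES];	/* one QGROUP_RESERVED-like bit per page */
	size_t rsv_bytes;		/* total reserved data space */
};

/*
 * Set the bit for pages [first, first + count) and charge only the bits
 * that were not already set. Returns the bytes newly reserved.
 */
static size_t sk_reserve(struct sk_inode *in, size_t first, size_t count)
{
	size_t charged = 0;

	for (size_t i = first; i < first + count && i < SK_NPAGES; i++) {
		if (!in->reserved_bit[i]) {
			in->reserved_bit[i] = true;
			charged += SK_PAGE_SIZE;
		}
	}
	in->rsv_bytes += charged;
	return charged;
}

/*
 * Clear the bit for pages [first, first + count) and uncharge only the
 * bits that were actually set. Returns the bytes freed, so calling it on
 * an already-released range frees 0 and the counter cannot underflow.
 */
static size_t sk_free(struct sk_inode *in, size_t first, size_t count)
{
	size_t freed = 0;

	for (size_t i = first; i < first + count && i < SK_NPAGES; i++) {
		if (in->reserved_bit[i]) {
			in->reserved_bit[i] = false;
			freed += SK_PAGE_SIZE;
		}
	}
	assert(in->rsv_bytes >= freed);
	in->rsv_bytes -= freed;
	return freed;
}
```

Because the amount uncharged is derived from the bits that were cleared rather than from the size of the requested range, a caller in this sketch can release a range unconditionally, which is the property the commit depends on when it drops the `PageDirty()` check.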
