Skip to content

Commit 25f3c50

Browse files
masonclkdave
authored andcommitted
Btrfs: keep pages dirty when using btrfs_writepage_fixup_worker
For COW, btrfs expects pages dirty pages to have been through a few setup steps. This includes reserving space for the new block allocations and marking the range in the state tree for delayed allocation. A few places outside btrfs will dirty pages directly, especially when unmapping mmap'd pages. In order for these to properly go through COW, we run them through a fixup worker to wait for stable pages, and do the delalloc prep. 87826df added a window where the dirty pages were cleaned, but pending more action from the fixup worker. We clear_page_dirty_for_io() before we call into writepage, so the page is no longer dirty. The commit changed it so now we leave the page clean between unlocking it here and the fixup worker starting at some point in the future. During this window, page migration can jump in and relocate the page. Once our fixup work actually starts, it finds page->mapping is NULL and we end up freeing the page without ever writing it. This leads to crc errors and other exciting problems, since it screws up the whole statemachine for waiting for ordered extents. The fix here is to keep the page dirty while we're waiting for the fixup worker to get to work. This is accomplished by returning -EAGAIN from btrfs_writepage_cow_fixup if we queued the page up for fixup, which will cause the writepage function to redirty the page. Because we now expect the page to be dirty once it gets to the fixup worker we must adjust the error cases to call clear_page_dirty_for_io() on the page. That is the bulk of the patch, but it is not the fix, the fix is the -EAGAIN from btrfs_writepage_cow_fixup. We cannot separate these two changes out because the error conditions change with the new expectations. Signed-off-by: Chris Mason <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
1 parent a30a3d2 commit 25f3c50

File tree

1 file changed

+44
-17
lines changed

1 file changed

+44
-17
lines changed

fs/btrfs/inode.c

Lines changed: 44 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -2202,17 +2202,27 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
22022202
struct inode *inode;
22032203
u64 page_start;
22042204
u64 page_end;
2205-
int ret;
2205+
int ret = 0;
22062206

22072207
fixup = container_of(work, struct btrfs_writepage_fixup, work);
22082208
page = fixup->page;
22092209
again:
22102210
lock_page(page);
2211-
if (!page->mapping || !PageDirty(page) || !PageChecked(page)) {
2212-
ClearPageChecked(page);
2211+
2212+
/*
2213+
* Before we queued this fixup, we took a reference on the page.
2214+
* page->mapping may go NULL, but it shouldn't be moved to a different
2215+
* address space.
2216+
*/
2217+
if (!page->mapping || !PageDirty(page) || !PageChecked(page))
22132218
goto out_page;
2214-
}
22152219

2220+
/*
2221+
* We keep the PageChecked() bit set until we're done with the
2222+
* btrfs_start_ordered_extent() dance that we do below. That drops and
2223+
* retakes the page lock, so we don't want new fixup workers queued for
2224+
* this page during the churn.
2225+
*/
22162226
inode = page->mapping->host;
22172227
page_start = page_offset(page);
22182228
page_end = page_offset(page) + PAGE_SIZE - 1;
@@ -2237,24 +2247,22 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
22372247

22382248
ret = btrfs_delalloc_reserve_space(inode, &data_reserved, page_start,
22392249
PAGE_SIZE);
2240-
if (ret) {
2241-
mapping_set_error(page->mapping, ret);
2242-
end_extent_writepage(page, ret, page_start, page_end);
2243-
ClearPageChecked(page);
2250+
if (ret)
22442251
goto out;
2245-
}
22462252

22472253
ret = btrfs_set_extent_delalloc(inode, page_start, page_end, 0,
22482254
&cached_state);
2249-
if (ret) {
2250-
mapping_set_error(page->mapping, ret);
2251-
end_extent_writepage(page, ret, page_start, page_end);
2252-
ClearPageChecked(page);
2255+
if (ret)
22532256
goto out_reserved;
2254-
}
22552257

2256-
ClearPageChecked(page);
2257-
set_page_dirty(page);
2258+
/*
2259+
* Everything went as planned, we're now the owner of a dirty page with
2260+
* delayed allocation bits set and space reserved for our COW
2261+
* destination.
2262+
*
2263+
* The page was dirty when we started, nothing should have cleaned it.
2264+
*/
2265+
BUG_ON(!PageDirty(page));
22582266
out_reserved:
22592267
btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE);
22602268
if (ret)
@@ -2264,6 +2272,17 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
22642272
unlock_extent_cached(&BTRFS_I(inode)->io_tree, page_start, page_end,
22652273
&cached_state);
22662274
out_page:
2275+
if (ret) {
2276+
/*
2277+
* We hit ENOSPC or other errors. Update the mapping and page
2278+
* to reflect the errors and clean the page.
2279+
*/
2280+
mapping_set_error(page->mapping, ret);
2281+
end_extent_writepage(page, ret, page_start, page_end);
2282+
clear_page_dirty_for_io(page);
2283+
SetPageError(page);
2284+
}
2285+
ClearPageChecked(page);
22672286
unlock_page(page);
22682287
put_page(page);
22692288
kfree(fixup);
@@ -2291,6 +2310,13 @@ int btrfs_writepage_cow_fixup(struct page *page, u64 start, u64 end)
22912310
if (TestClearPagePrivate2(page))
22922311
return 0;
22932312

2313+
/*
2314+
* PageChecked is set below when we create a fixup worker for this page,
2315+
* don't try to create another one if we're already PageChecked()
2316+
*
2317+
* The extent_io writepage code will redirty the page if we send back
2318+
* EAGAIN.
2319+
*/
22942320
if (PageChecked(page))
22952321
return -EAGAIN;
22962322

@@ -2303,7 +2329,8 @@ int btrfs_writepage_cow_fixup(struct page *page, u64 start, u64 end)
23032329
btrfs_init_work(&fixup->work, btrfs_writepage_fixup_worker, NULL, NULL);
23042330
fixup->page = page;
23052331
btrfs_queue_work(fs_info->fixup_workers, &fixup->work);
2306-
return -EBUSY;
2332+
2333+
return -EAGAIN;
23072334
}
23082335

23092336
static int insert_reserved_file_extent(struct btrfs_trans_handle *trans,

0 commit comments

Comments
 (0)