Commit 3777369

adam900710 authored and kdave committed
btrfs: verify the transid of the to-be-written dirty extent buffer
[BUG]
There is a bug report that a bitflip in the transid part of an extent buffer makes btrfs reject certain tree blocks:

  BTRFS error (device dm-0): parent transid verify failed on 1382301696 wanted 262166 found 22

[CAUSE]
Note the failed transid check: hex(262166) = 0x40016, while hex(22) = 0x16.

It's an obvious bitflip.

Furthermore, the reporter confirmed that the bitflip came from the hardware, so it is a real hardware-caused bitflip, and such a problem cannot be detected by the existing tree-checker framework.

Tree-checker can only verify the content inside one tree block, while the generation of a tree block can only be verified against its parent.

So such problems remain undetected.

[FIX]
Although tree-checker cannot verify it at write time, we still have a quick (but not the most accurate) way to catch such obvious corruption.

Function csum_one_extent_buffer() is called before we submit a metadata write.

This means every extent buffer passed in should be a dirty tree block, and should be newer than the last committed transaction.

Using that, we can catch the above bitflip.

It's not a perfect solution, though: if the corrupted generation is higher than the correct value, we have no way to catch it at all.

Reported-by: Christoph Anton Mitterer <[email protected]>
Link: https://lore.kernel.org/linux-btrfs/[email protected]/
CC: [email protected] # 5.15+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
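The arithmetic in the [CAUSE] section can be checked with a few lines of userspace C. This is only an illustrative sketch, not part of the commit: the helper names is_single_bitflip() and flipped_bit_index() are made up for this example. XOR-ing the wanted and found transids leaves exactly the bits that differ, and a value with a single bit set is what a one-bit flip looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a and b differ in exactly one bit position. */
static bool is_single_bitflip(uint64_t a, uint64_t b)
{
	uint64_t diff = a ^ b;

	/* A nonzero value with one bit set shares no bits with (itself - 1). */
	return diff != 0 && (diff & (diff - 1)) == 0;
}

/*
 * Hypothetical helper: index of the differing bit (0 = least significant).
 * Only valid when the two values differ in exactly one bit.
 */
static int flipped_bit_index(uint64_t a, uint64_t b)
{
	uint64_t diff = a ^ b;
	int index = 0;

	assert(is_single_bitflip(a, b));
	while ((diff >>= 1) != 0)
		index++;
	return index;
}
```

For the values in the report, 262166 ^ 22 is 0x40000, so is_single_bitflip(262166, 22) holds and flipped_bit_index(262166, 22) is 18.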
1 parent 9a4ffa1 commit 3777369

fs/btrfs/disk-io.c

Lines changed: 20 additions & 6 deletions
@@ -441,17 +441,31 @@ static int csum_one_extent_buffer(struct extent_buffer *eb)
 	else
 		ret = btrfs_check_leaf_full(eb);
 
-	if (ret < 0) {
-		btrfs_print_tree(eb, 0);
+	if (ret < 0)
+		goto error;
+
+	/*
+	 * Also check the generation, the eb reached here must be newer than
+	 * last committed. Or something seriously wrong happened.
+	 */
+	if (unlikely(btrfs_header_generation(eb) <= fs_info->last_trans_committed)) {
+		ret = -EUCLEAN;
 		btrfs_err(fs_info,
-		"block=%llu write time tree block corruption detected",
-			  eb->start);
-		WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
-		return ret;
+			"block=%llu bad generation, have %llu expect > %llu",
+			  eb->start, btrfs_header_generation(eb),
+			  fs_info->last_trans_committed);
+		goto error;
 	}
 	write_extent_buffer(eb, result, 0, fs_info->csum_size);
 
 	return 0;
+
+error:
+	btrfs_print_tree(eb, 0);
+	btrfs_err(fs_info, "block=%llu write time tree block corruption detected",
+		  eb->start);
+	WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
+	return ret;
 }
 
 /* Checksum all dirty extent buffers in one bio_vec */
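
To make the scope of the new check concrete, the comparison in the hunk above can be modelled outside the kernel. The sketch below is an assumption-laden userspace model, not kernel code: struct model_block, MODEL_EUCLEAN and check_write_time_generation() are hypothetical stand-ins for the extent buffer header, the kernel's EUCLEAN and the added branch. It keeps only the rule from the patch: a dirty block's generation must be strictly greater than the last committed transaction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's EUCLEAN error code. */
#define MODEL_EUCLEAN 117

/* Hypothetical, minimal model of the fields the check looks at. */
struct model_block {
	uint64_t start;       /* logical address of the tree block */
	uint64_t generation;  /* transid stored in the block header */
};

/*
 * Model of the write-time rule from the patch: a dirty tree block must be
 * newer than the last committed transaction. Returns 0 if the generation
 * is plausible, or a negative error if it is not.
 */
static int check_write_time_generation(const struct model_block *blk,
				       uint64_t last_trans_committed)
{
	assert(blk != NULL);

	if (blk->generation <= last_trans_committed)
		return -MODEL_EUCLEAN;
	return 0;
}
```

Assuming, purely for illustration, that the last committed transaction was 262165, a block carrying the flipped generation 22 is rejected, while the intended generation 262166 passes. A flip that raises the value, for example 262166 with bit 20 set, also passes, which is the limitation the commit message acknowledges.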
