Commit dd6a571

adam900710 authored and kdave committed
btrfs: tree-checker: dump the page status if hit something wrong
[BUG]
There is a bug report about a very suspicious tree-checker error being triggered:

  BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360]
  BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360]
  BTRFS critical (device dm-0): corrupted node, root=256 block=8550954455682405139 owner mismatch, have 11858205567642294356 expect [256, 18446744073709551360]
  SELinux: inode_doinit_use_xattr: getxattr returned 117 for dev=dm-0 ino=5737268

[ANALYZE]
The root cause is still unclear, but there are some clues already:

- Unaligned eb bytenr
  The block bytenr is 8550954455682405139, which is not even aligned to 2.
  This bytenr is fetched from the extent buffer header, not from eb->start.

  This means that at the initial time of read, the eb header bytenr was still correct (the very basic check needed to continue the read), but later something went wrong and at least the first page got corrupted.
  Thus we got such an obviously incorrect value.

- Invalid extent buffer header owner
  The read itself is triggered for subvolume 256, but the eb header owner is 11858205567642294356, which is not really possible.
  The problem here is that the subvolume id is limited to ((1 << 48) - 1), and this one definitely goes beyond that limit.
  So this value is another piece of garbage.

We already got two garbage values from an extent buffer which passed the initial bytenr and csum checks, but whose contents became garbage at some later point.

This looks like a page lifespan problem (e.g. we didn't properly hold the page).

[ENHANCEMENT]
The current tree-checker only outputs things from the extent buffer, nothing about the page status.

So this patch enhances the tree-checker output by also dumping the first page, which looks like this:

  page:00000000aa9f3ce8 refcount:4 mapcount:0 mapping:00000000169aa6b6 index:0x1d0c pfn:0x1022e5
  memcg:ffff888103456000
  aops:btree_aops [btrfs] ino:1
  flags: 0x2ffff0000008000(private|node=0|zone=2|lastcpupid=0xffff)
  page_type: 0xffffffff()
  raw: 02ffff0000008000 0000000000000000 dead000000000122 ffff88811e06e220
  raw: 0000000000001d0c ffff888102fdb1d8 00000004ffffffff ffff888103456000
  page dumped because: eb page dump
  BTRFS critical (device dm-3): corrupt leaf: root=5 block=30457856 slot=6 ino=257 file_offset=0, invalid disk_bytenr for file extent, have 10617606235235216665, should be aligned to 4096
  BTRFS error (device dm-3): read time tree block corruption detected on logical 30457856 mirror 1

From the dump we can see some extra info that can help us do extra cross-checks:

- Page refcount
  If it's too low, it definitely means something bad.

- Page aops
  Any mapped eb page should have btree_aops with inode number 1.

- Page index
  Since a mapped eb page should have its bytenr matching the page position, (index << PAGE_SHIFT) should match the block bytenr from the critical line.

- Page Private flags
  A mapped eb page should have the Private flag set to indicate it's managed by btrfs.

Link: https://lore.kernel.org/linux-btrfs/CAHk-=whNdMaN9ntZ47XRKP6DBes2E5w7fi-0U3H2+PS18p+Pzw@mail.gmail.com/
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
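The plausibility checks and page cross-checks described in the commit message can be sketched in userspace C, which may help when reading such a dump by hand. This is only an illustration and not kernel code: `struct page_dump_sketch`, the `SKETCH_*` constants, and all function names are hypothetical. It assumes 4 KiB pages and a 4 KiB sector size (matching the example output), and because the commit message gives no threshold for a refcount that is "too low", the sketch merely rejects a non-positive count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: 4 KiB pages and a 4 KiB sector size, as in the example output. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_SECTORSIZE	4096ULL
/* Subvolume id limit from the commit message, read as (1 << 48) - 1. */
#define SKETCH_MAX_SUBVOL_ID	((1ULL << 48) - 1)

/* Hypothetical snapshot of the fields that the page dump prints. */
struct page_dump_sketch {
	int refcount;		/* "refcount:" field */
	uint64_t index;		/* "index:" field */
	uint64_t ino;		/* "ino:" field */
	bool aops_is_btree;	/* true if "aops:" shows btree_aops */
	bool private_flag;	/* true if "flags:" lists private */
};

/* A tree block bytenr should at least be sector aligned. */
static inline bool bytenr_plausible(uint64_t bytenr)
{
	return bytenr % SKETCH_SECTORSIZE == 0;
}

/* Only the upper limit named in the commit message is checked here. */
static inline bool owner_plausible(uint64_t owner)
{
	return owner <= SKETCH_MAX_SUBVOL_ID;
}

/* The four cross-checks listed for the page dump, combined. */
static inline bool page_matches_eb(const struct page_dump_sketch *p,
				   uint64_t eb_bytenr)
{
	return p->refcount > 0 &&	/* assumed threshold, see lead-in */
	       p->aops_is_btree && p->ino == 1 &&
	       p->private_flag &&
	       (p->index << SKETCH_PAGE_SHIFT) == eb_bytenr;
}

/* Usage: feed in the values quoted in the commit message. */
static inline void plausibility_self_check(void)
{
	const struct page_dump_sketch page = {
		.refcount = 4, .index = 0x1d0c, .ino = 1,
		.aops_is_btree = true, .private_flag = true,
	};

	assert(!bytenr_plausible(8550954455682405139ULL));
	assert(!owner_plausible(11858205567642294356ULL));
	assert(page_matches_eb(&page, 30457856ULL));
}
```

With the values from the example dump, index 0x1d0c shifted left by 12 gives 30457856, which is the block number in the critical line, so that page looks consistent with its extent buffer.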
1 parent 25da852 commit dd6a571

1 file changed: 6 additions, 0 deletions

fs/btrfs/tree-checker.c

Lines changed: 6 additions & 0 deletions
@@ -65,6 +65,7 @@ static void generic_err(const struct extent_buffer *eb, int slot,
 	vaf.fmt = fmt;
 	vaf.va = &args;
 
+	dump_page(folio_page(eb->folios[0], 0), "eb page dump");
 	btrfs_crit(fs_info,
 		"corrupt %s: root=%llu block=%llu slot=%d, %pV",
 		btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -92,6 +93,7 @@ static void file_extent_err(const struct extent_buffer *eb, int slot,
 	vaf.fmt = fmt;
 	vaf.va = &args;
 
+	dump_page(folio_page(eb->folios[0], 0), "eb page dump");
 	btrfs_crit(fs_info,
 		"corrupt %s: root=%llu block=%llu slot=%d ino=%llu file_offset=%llu, %pV",
 		btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -152,6 +154,7 @@ static void dir_item_err(const struct extent_buffer *eb, int slot,
 	vaf.fmt = fmt;
 	vaf.va = &args;
 
+	dump_page(folio_page(eb->folios[0], 0), "eb page dump");
 	btrfs_crit(fs_info,
 		"corrupt %s: root=%llu block=%llu slot=%d ino=%llu, %pV",
 		btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -647,6 +650,7 @@ static void block_group_err(const struct extent_buffer *eb, int slot,
 	vaf.fmt = fmt;
 	vaf.va = &args;
 
+	dump_page(folio_page(eb->folios[0], 0), "eb page dump");
 	btrfs_crit(fs_info,
 		"corrupt %s: root=%llu block=%llu slot=%d bg_start=%llu bg_len=%llu, %pV",
 		btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -1003,6 +1007,7 @@ static void dev_item_err(const struct extent_buffer *eb, int slot,
 	vaf.fmt = fmt;
 	vaf.va = &args;
 
+	dump_page(folio_page(eb->folios[0], 0), "eb page dump");
 	btrfs_crit(eb->fs_info,
 		"corrupt %s: root=%llu block=%llu slot=%d devid=%llu %pV",
 		btrfs_header_level(eb) == 0 ? "leaf" : "node",
@@ -1258,6 +1263,7 @@ static void extent_err(const struct extent_buffer *eb, int slot,
 	vaf.fmt = fmt;
 	vaf.va = &args;
 
+	dump_page(folio_page(eb->folios[0], 0), "eb page dump");
 	btrfs_crit(eb->fs_info,
 		"corrupt %s: block=%llu slot=%d extent bytenr=%llu len=%llu %pV",
 		btrfs_header_level(eb) == 0 ? "leaf" : "node",
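
All six hunks apply the same pattern: inside a variadic error helper, the state of the first backing page is dumped immediately before the formatted corruption line is printed, so the two appear next to each other in the log. A rough userspace analogue of that ordering is sketched below. Everything in it is hypothetical (`struct eb_sketch`, `generic_err_sketch`, the `append` helpers, and the abbreviated page line), and it writes into a caller-supplied buffer instead of the kernel log; it does not reproduce the real `dump_page()` output.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an extent buffer; not the kernel structure. */
struct eb_sketch {
	unsigned long long start;	/* logical bytenr of the tree block */
	int level;			/* 0 means leaf, anything else a node */
	unsigned long page_index;	/* index of the first backing page */
	int page_refcount;		/* refcount of the first backing page */
};

/* Append formatted text at buf + off, never writing past len bytes. */
static inline size_t append_v(char *buf, size_t len, size_t off,
			      const char *fmt, va_list ap)
{
	int n;

	if (off >= len)
		return off;
	n = vsnprintf(buf + off, len - off, fmt, ap);
	if (n < 0)
		return off;
	/* On truncation, report the buffer as full. */
	return ((size_t)n >= len - off) ? len : off + (size_t)n;
}

static inline size_t append(char *buf, size_t len, size_t off,
			    const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	off = append_v(buf, len, off, fmt, ap);
	va_end(ap);
	return off;
}

/* Page state first, then the corruption line, in the order the patch uses. */
static inline void generic_err_sketch(char *buf, size_t len,
				      const struct eb_sketch *eb, int slot,
				      const char *fmt, ...)
{
	va_list ap;
	size_t off = 0;

	off = append(buf, len, off, "page: refcount:%d index:0x%lx\n",
		     eb->page_refcount, eb->page_index);
	off = append(buf, len, off, "page dumped because: %s\n", "eb page dump");
	off = append(buf, len, off, "corrupt %s: block=%llu slot=%d, ",
		     eb->level == 0 ? "leaf" : "node", eb->start, slot);
	va_start(ap, fmt);
	append_v(buf, len, off, fmt, ap);
	va_end(ap);
}

/* True if the page dump line appears before the corruption line. */
static inline bool dump_precedes_crit(const char *log)
{
	const char *dump = strstr(log, "page dumped because: eb page dump");
	const char *crit = strstr(log, "corrupt ");

	return dump && crit && dump < crit;
}

/* Usage: build one report and check the ordering. */
static inline void ordering_self_check(void)
{
	const struct eb_sketch eb = {
		.start = 30457856ULL, .level = 0,
		.page_index = 0x1d0c, .page_refcount = 4,
	};
	char log[256];

	generic_err_sketch(log, sizeof(log), &eb, 6, "invalid value %d", 42);
	assert(dump_precedes_crit(log));
}
```

Emitting the page state before the corruption line keeps the two together when reading a log top to bottom, which matches the example output quoted in the commit message.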
