Skip to content

Commit e0a8de7

Browse files
Dave Chinnerdchinner
authored andcommitted
xfs: fix agf/agfl verification on v4 filesystems
When a v4 filesystem has fl_last - fl_first != fl_count, we do not not detect the corruption and allow the AGF to be used as it if was fully valid. On V5 filesystems, we reset the AGFL to empty in these cases and avoid the corruption at a small cost of leaked blocks. If we don't catch the corruption on V4 filesystems, bad things happen later when an allocation attempts to trim the free list and either double-frees stale entries in the AGFl or tries to free NULLAGBNO entries. Either way, this is bad. Prevent this from happening by using the AGFL_NEED_RESET logic for v4 filesysetms, too. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
1 parent 1e47327 commit e0a8de7

File tree

1 file changed

+42
-17
lines changed

1 file changed

+42
-17
lines changed

fs/xfs/libxfs/xfs_alloc.c

Lines changed: 42 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -628,6 +628,25 @@ xfs_alloc_fixup_trees(
628628
return 0;
629629
}
630630

631+
/*
632+
* We do not verify the AGFL contents against AGF-based index counters here,
633+
* even though we may have access to the perag that contains shadow copies. We
634+
* don't know if the AGF based counters have been checked, and if they have they
635+
* still may be inconsistent because they haven't yet been reset on the first
636+
* allocation after the AGF has been read in.
637+
*
638+
* This means we can only check that all agfl entries contain valid or null
639+
* values because we can't reliably determine the active range to exclude
640+
* NULLAGBNO as a valid value.
641+
*
642+
* However, we can't even do that for v4 format filesystems because there are
643+
* old versions of mkfs out there that does not initialise the AGFL to known,
644+
* verifiable values. HEnce we can't tell the difference between a AGFL block
645+
* allocated by mkfs and a corrupted AGFL block here on v4 filesystems.
646+
*
647+
* As a result, we can only fully validate AGFL block numbers when we pull them
648+
* from the freelist in xfs_alloc_get_freelist().
649+
*/
631650
static xfs_failaddr_t
632651
xfs_agfl_verify(
633652
struct xfs_buf *bp)
@@ -637,12 +656,6 @@ xfs_agfl_verify(
637656
__be32 *agfl_bno = xfs_buf_to_agfl_bno(bp);
638657
int i;
639658

640-
/*
641-
* There is no verification of non-crc AGFLs because mkfs does not
642-
* initialise the AGFL to zero or NULL. Hence the only valid part of the
643-
* AGFL is what the AGF says is active. We can't get to the AGF, so we
644-
* can't verify just those entries are valid.
645-
*/
646659
if (!xfs_has_crc(mp))
647660
return NULL;
648661

@@ -2321,12 +2334,16 @@ xfs_free_agfl_block(
23212334
}
23222335

23232336
/*
2324-
* Check the agfl fields of the agf for inconsistency or corruption. The purpose
2325-
* is to detect an agfl header padding mismatch between current and early v5
2326-
* kernels. This problem manifests as a 1-slot size difference between the
2327-
* on-disk flcount and the active [first, last] range of a wrapped agfl. This
2328-
* may also catch variants of agfl count corruption unrelated to padding. Either
2329-
* way, we'll reset the agfl and warn the user.
2337+
* Check the agfl fields of the agf for inconsistency or corruption.
2338+
*
2339+
* The original purpose was to detect an agfl header padding mismatch between
2340+
* current and early v5 kernels. This problem manifests as a 1-slot size
2341+
* difference between the on-disk flcount and the active [first, last] range of
2342+
* a wrapped agfl.
2343+
*
2344+
* However, we need to use these same checks to catch agfl count corruptions
2345+
* unrelated to padding. This could occur on any v4 or v5 filesystem, so either
2346+
* way, we need to reset the agfl and warn the user.
23302347
*
23312348
* Return true if a reset is required before the agfl can be used, false
23322349
* otherwise.
@@ -2342,10 +2359,6 @@ xfs_agfl_needs_reset(
23422359
int agfl_size = xfs_agfl_size(mp);
23432360
int active;
23442361

2345-
/* no agfl header on v4 supers */
2346-
if (!xfs_has_crc(mp))
2347-
return false;
2348-
23492362
/*
23502363
* The agf read verifier catches severe corruption of these fields.
23512364
* Repeat some sanity checks to cover a packed -> unpacked mismatch if
@@ -2889,6 +2902,19 @@ xfs_alloc_put_freelist(
28892902
return 0;
28902903
}
28912904

2905+
/*
2906+
* Verify the AGF is consistent.
2907+
*
2908+
* We do not verify the AGFL indexes in the AGF are fully consistent here
2909+
* because of issues with variable on-disk structure sizes. Instead, we check
2910+
* the agfl indexes for consistency when we initialise the perag from the AGF
2911+
* information after a read completes.
2912+
*
2913+
* If the index is inconsistent, then we mark the perag as needing an AGFL
2914+
* reset. The first AGFL update performed then resets the AGFL indexes and
2915+
* refills the AGFL with known good free blocks, allowing the filesystem to
2916+
* continue operating normally at the cost of a few leaked free space blocks.
2917+
*/
28922918
static xfs_failaddr_t
28932919
xfs_agf_verify(
28942920
struct xfs_buf *bp)
@@ -2962,7 +2988,6 @@ xfs_agf_verify(
29622988
return __this_address;
29632989

29642990
return NULL;
2965-
29662991
}
29672992

29682993
static void

0 commit comments

Comments
 (0)