Skip to content

Commit 79b6fad

Browse files
committed
Merge tag 'xfs-6.4-rc5-fixes' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Dave Chinner: "These are a set of regression fixes discovered on recent kernels. I was hoping to send this to you a week and half ago, but events out of my control delayed finalising the changes until early this week. Whilst the diffstat looks large for this stage of the merge window, a large chunk of it comes from moving the guts of one function from one file to another i.e. it's the same code, it is just run in a different context where it is safe to hold a specific lock. Otherwise the individual changes are relatively small and straigtht forward. Summary: - Propagate unlinked inode list corruption back up to log recovery (regression fix) - improve corruption detection for AGFL entries, AGFL indexes and XEFI extents (syzkaller fuzzer oops report) - Avoid double perag reference release (regression fix) - Improve extent merging detection in scrub (regression fix) - Fix a new undefined high bit shift (regression fix) - Fix for AGF vs inode cluster buffer deadlock (regression fix)" * tag 'xfs-6.4-rc5-fixes' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: collect errors from inodegc for unlinked inode recovery xfs: validate block number being freed before adding to xefi xfs: validity check agbnos on the AGFL xfs: fix agf/agfl verification on v4 filesystems xfs: fix double xfs_perag_rele() in xfs_filestream_pick_ag() xfs: fix broken logic when detecting mergeable bmap records xfs: Fix undefined behavior of shift into sign bit xfs: fix AGF vs inode cluster buffer deadlock xfs: defered work could create precommits xfs: restore allocation trylock iteration xfs: buffer pins need to hold a buffer reference
2 parents 5f63595 + d4d12c0 commit 79b6fad

24 files changed

+427
-229
lines changed

fs/xfs/libxfs/xfs_ag.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -984,7 +984,10 @@ xfs_ag_shrink_space(
984984
if (err2 != -ENOSPC)
985985
goto resv_err;
986986

987-
__xfs_free_extent_later(*tpp, args.fsbno, delta, NULL, true);
987+
err2 = __xfs_free_extent_later(*tpp, args.fsbno, delta, NULL,
988+
true);
989+
if (err2)
990+
goto resv_err;
988991

989992
/*
990993
* Roll the transaction before trying to re-init the per-ag

fs/xfs/libxfs/xfs_alloc.c

Lines changed: 65 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -628,6 +628,25 @@ xfs_alloc_fixup_trees(
628628
return 0;
629629
}
630630

631+
/*
632+
* We do not verify the AGFL contents against AGF-based index counters here,
633+
* even though we may have access to the perag that contains shadow copies. We
634+
* don't know if the AGF based counters have been checked, and if they have they
635+
* still may be inconsistent because they haven't yet been reset on the first
636+
* allocation after the AGF has been read in.
637+
*
638+
* This means we can only check that all agfl entries contain valid or null
639+
* values because we can't reliably determine the active range to exclude
640+
* NULLAGBNO as a valid value.
641+
*
642+
* However, we can't even do that for v4 format filesystems because there are
643+
* old versions of mkfs out there that does not initialise the AGFL to known,
644+
* verifiable values. HEnce we can't tell the difference between a AGFL block
645+
* allocated by mkfs and a corrupted AGFL block here on v4 filesystems.
646+
*
647+
* As a result, we can only fully validate AGFL block numbers when we pull them
648+
* from the freelist in xfs_alloc_get_freelist().
649+
*/
631650
static xfs_failaddr_t
632651
xfs_agfl_verify(
633652
struct xfs_buf *bp)
@@ -637,12 +656,6 @@ xfs_agfl_verify(
637656
__be32 *agfl_bno = xfs_buf_to_agfl_bno(bp);
638657
int i;
639658

640-
/*
641-
* There is no verification of non-crc AGFLs because mkfs does not
642-
* initialise the AGFL to zero or NULL. Hence the only valid part of the
643-
* AGFL is what the AGF says is active. We can't get to the AGF, so we
644-
* can't verify just those entries are valid.
645-
*/
646659
if (!xfs_has_crc(mp))
647660
return NULL;
648661

@@ -2321,12 +2334,16 @@ xfs_free_agfl_block(
23212334
}
23222335

23232336
/*
2324-
* Check the agfl fields of the agf for inconsistency or corruption. The purpose
2325-
* is to detect an agfl header padding mismatch between current and early v5
2326-
* kernels. This problem manifests as a 1-slot size difference between the
2327-
* on-disk flcount and the active [first, last] range of a wrapped agfl. This
2328-
* may also catch variants of agfl count corruption unrelated to padding. Either
2329-
* way, we'll reset the agfl and warn the user.
2337+
* Check the agfl fields of the agf for inconsistency or corruption.
2338+
*
2339+
* The original purpose was to detect an agfl header padding mismatch between
2340+
* current and early v5 kernels. This problem manifests as a 1-slot size
2341+
* difference between the on-disk flcount and the active [first, last] range of
2342+
* a wrapped agfl.
2343+
*
2344+
* However, we need to use these same checks to catch agfl count corruptions
2345+
* unrelated to padding. This could occur on any v4 or v5 filesystem, so either
2346+
* way, we need to reset the agfl and warn the user.
23302347
*
23312348
* Return true if a reset is required before the agfl can be used, false
23322349
* otherwise.
@@ -2342,10 +2359,6 @@ xfs_agfl_needs_reset(
23422359
int agfl_size = xfs_agfl_size(mp);
23432360
int active;
23442361

2345-
/* no agfl header on v4 supers */
2346-
if (!xfs_has_crc(mp))
2347-
return false;
2348-
23492362
/*
23502363
* The agf read verifier catches severe corruption of these fields.
23512364
* Repeat some sanity checks to cover a packed -> unpacked mismatch if
@@ -2418,7 +2431,7 @@ xfs_agfl_reset(
24182431
* the real allocation can proceed. Deferring the free disconnects freeing up
24192432
* the AGFL slot from freeing the block.
24202433
*/
2421-
STATIC void
2434+
static int
24222435
xfs_defer_agfl_block(
24232436
struct xfs_trans *tp,
24242437
xfs_agnumber_t agno,
@@ -2437,17 +2450,21 @@ xfs_defer_agfl_block(
24372450
xefi->xefi_blockcount = 1;
24382451
xefi->xefi_owner = oinfo->oi_owner;
24392452

2453+
if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbno(mp, xefi->xefi_startblock)))
2454+
return -EFSCORRUPTED;
2455+
24402456
trace_xfs_agfl_free_defer(mp, agno, 0, agbno, 1);
24412457

24422458
xfs_extent_free_get_group(mp, xefi);
24432459
xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &xefi->xefi_list);
2460+
return 0;
24442461
}
24452462

24462463
/*
24472464
* Add the extent to the list of extents to be free at transaction end.
24482465
* The list is maintained sorted (by block number).
24492466
*/
2450-
void
2467+
int
24512468
__xfs_free_extent_later(
24522469
struct xfs_trans *tp,
24532470
xfs_fsblock_t bno,
@@ -2474,6 +2491,9 @@ __xfs_free_extent_later(
24742491
#endif
24752492
ASSERT(xfs_extfree_item_cache != NULL);
24762493

2494+
if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbext(mp, bno, len)))
2495+
return -EFSCORRUPTED;
2496+
24772497
xefi = kmem_cache_zalloc(xfs_extfree_item_cache,
24782498
GFP_KERNEL | __GFP_NOFAIL);
24792499
xefi->xefi_startblock = bno;
@@ -2497,6 +2517,7 @@ __xfs_free_extent_later(
24972517

24982518
xfs_extent_free_get_group(mp, xefi);
24992519
xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_FREE, &xefi->xefi_list);
2520+
return 0;
25002521
}
25012522

25022523
#ifdef DEBUG
@@ -2657,7 +2678,9 @@ xfs_alloc_fix_freelist(
26572678
goto out_agbp_relse;
26582679

26592680
/* defer agfl frees */
2660-
xfs_defer_agfl_block(tp, args->agno, bno, &targs.oinfo);
2681+
error = xfs_defer_agfl_block(tp, args->agno, bno, &targs.oinfo);
2682+
if (error)
2683+
goto out_agbp_relse;
26612684
}
26622685

26632686
targs.tp = tp;
@@ -2767,6 +2790,9 @@ xfs_alloc_get_freelist(
27672790
*/
27682791
agfl_bno = xfs_buf_to_agfl_bno(agflbp);
27692792
bno = be32_to_cpu(agfl_bno[be32_to_cpu(agf->agf_flfirst)]);
2793+
if (XFS_IS_CORRUPT(tp->t_mountp, !xfs_verify_agbno(pag, bno)))
2794+
return -EFSCORRUPTED;
2795+
27702796
be32_add_cpu(&agf->agf_flfirst, 1);
27712797
xfs_trans_brelse(tp, agflbp);
27722798
if (be32_to_cpu(agf->agf_flfirst) == xfs_agfl_size(mp))
@@ -2889,6 +2915,19 @@ xfs_alloc_put_freelist(
28892915
return 0;
28902916
}
28912917

2918+
/*
2919+
* Verify the AGF is consistent.
2920+
*
2921+
* We do not verify the AGFL indexes in the AGF are fully consistent here
2922+
* because of issues with variable on-disk structure sizes. Instead, we check
2923+
* the agfl indexes for consistency when we initialise the perag from the AGF
2924+
* information after a read completes.
2925+
*
2926+
* If the index is inconsistent, then we mark the perag as needing an AGFL
2927+
* reset. The first AGFL update performed then resets the AGFL indexes and
2928+
* refills the AGFL with known good free blocks, allowing the filesystem to
2929+
* continue operating normally at the cost of a few leaked free space blocks.
2930+
*/
28922931
static xfs_failaddr_t
28932932
xfs_agf_verify(
28942933
struct xfs_buf *bp)
@@ -2962,7 +3001,6 @@ xfs_agf_verify(
29623001
return __this_address;
29633002

29643003
return NULL;
2965-
29663004
}
29673005

29683006
static void
@@ -3187,7 +3225,8 @@ xfs_alloc_vextent_check_args(
31873225
*/
31883226
static int
31893227
xfs_alloc_vextent_prepare_ag(
3190-
struct xfs_alloc_arg *args)
3228+
struct xfs_alloc_arg *args,
3229+
uint32_t flags)
31913230
{
31923231
bool need_pag = !args->pag;
31933232
int error;
@@ -3196,7 +3235,7 @@ xfs_alloc_vextent_prepare_ag(
31963235
args->pag = xfs_perag_get(args->mp, args->agno);
31973236

31983237
args->agbp = NULL;
3199-
error = xfs_alloc_fix_freelist(args, 0);
3238+
error = xfs_alloc_fix_freelist(args, flags);
32003239
if (error) {
32013240
trace_xfs_alloc_vextent_nofix(args);
32023241
if (need_pag)
@@ -3336,7 +3375,7 @@ xfs_alloc_vextent_this_ag(
33363375
return error;
33373376
}
33383377

3339-
error = xfs_alloc_vextent_prepare_ag(args);
3378+
error = xfs_alloc_vextent_prepare_ag(args, 0);
33403379
if (!error && args->agbp)
33413380
error = xfs_alloc_ag_vextent_size(args);
33423381

@@ -3380,7 +3419,7 @@ xfs_alloc_vextent_iterate_ags(
33803419
for_each_perag_wrap_range(mp, start_agno, restart_agno,
33813420
mp->m_sb.sb_agcount, agno, args->pag) {
33823421
args->agno = agno;
3383-
error = xfs_alloc_vextent_prepare_ag(args);
3422+
error = xfs_alloc_vextent_prepare_ag(args, flags);
33843423
if (error)
33853424
break;
33863425
if (!args->agbp) {
@@ -3546,7 +3585,7 @@ xfs_alloc_vextent_exact_bno(
35463585
return error;
35473586
}
35483587

3549-
error = xfs_alloc_vextent_prepare_ag(args);
3588+
error = xfs_alloc_vextent_prepare_ag(args, 0);
35503589
if (!error && args->agbp)
35513590
error = xfs_alloc_ag_vextent_exact(args);
35523591

@@ -3587,7 +3626,7 @@ xfs_alloc_vextent_near_bno(
35873626
if (needs_perag)
35883627
args->pag = xfs_perag_grab(mp, args->agno);
35893628

3590-
error = xfs_alloc_vextent_prepare_ag(args);
3629+
error = xfs_alloc_vextent_prepare_ag(args, 0);
35913630
if (!error && args->agbp)
35923631
error = xfs_alloc_ag_vextent_near(args);
35933632

fs/xfs/libxfs/xfs_alloc.h

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -230,7 +230,7 @@ xfs_buf_to_agfl_bno(
230230
return bp->b_addr;
231231
}
232232

233-
void __xfs_free_extent_later(struct xfs_trans *tp, xfs_fsblock_t bno,
233+
int __xfs_free_extent_later(struct xfs_trans *tp, xfs_fsblock_t bno,
234234
xfs_filblks_t len, const struct xfs_owner_info *oinfo,
235235
bool skip_discard);
236236

@@ -254,14 +254,14 @@ void xfs_extent_free_get_group(struct xfs_mount *mp,
254254
#define XFS_EFI_ATTR_FORK (1U << 1) /* freeing attr fork block */
255255
#define XFS_EFI_BMBT_BLOCK (1U << 2) /* freeing bmap btree block */
256256

257-
static inline void
257+
static inline int
258258
xfs_free_extent_later(
259259
struct xfs_trans *tp,
260260
xfs_fsblock_t bno,
261261
xfs_filblks_t len,
262262
const struct xfs_owner_info *oinfo)
263263
{
264-
__xfs_free_extent_later(tp, bno, len, oinfo, false);
264+
return __xfs_free_extent_later(tp, bno, len, oinfo, false);
265265
}
266266

267267

fs/xfs/libxfs/xfs_bmap.c

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -572,8 +572,12 @@ xfs_bmap_btree_to_extents(
572572
cblock = XFS_BUF_TO_BLOCK(cbp);
573573
if ((error = xfs_btree_check_block(cur, cblock, 0, cbp)))
574574
return error;
575+
575576
xfs_rmap_ino_bmbt_owner(&oinfo, ip->i_ino, whichfork);
576-
xfs_free_extent_later(cur->bc_tp, cbno, 1, &oinfo);
577+
error = xfs_free_extent_later(cur->bc_tp, cbno, 1, &oinfo);
578+
if (error)
579+
return error;
580+
577581
ip->i_nblocks--;
578582
xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, -1L);
579583
xfs_trans_binval(tp, cbp);
@@ -5230,10 +5234,12 @@ xfs_bmap_del_extent_real(
52305234
if (xfs_is_reflink_inode(ip) && whichfork == XFS_DATA_FORK) {
52315235
xfs_refcount_decrease_extent(tp, del);
52325236
} else {
5233-
__xfs_free_extent_later(tp, del->br_startblock,
5237+
error = __xfs_free_extent_later(tp, del->br_startblock,
52345238
del->br_blockcount, NULL,
52355239
(bflags & XFS_BMAPI_NODISCARD) ||
52365240
del->br_state == XFS_EXT_UNWRITTEN);
5241+
if (error)
5242+
goto done;
52375243
}
52385244
}
52395245

fs/xfs/libxfs/xfs_bmap_btree.c

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -268,11 +268,14 @@ xfs_bmbt_free_block(
268268
struct xfs_trans *tp = cur->bc_tp;
269269
xfs_fsblock_t fsbno = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp));
270270
struct xfs_owner_info oinfo;
271+
int error;
271272

272273
xfs_rmap_ino_bmbt_owner(&oinfo, ip->i_ino, cur->bc_ino.whichfork);
273-
xfs_free_extent_later(cur->bc_tp, fsbno, 1, &oinfo);
274-
ip->i_nblocks--;
274+
error = xfs_free_extent_later(cur->bc_tp, fsbno, 1, &oinfo);
275+
if (error)
276+
return error;
275277

278+
ip->i_nblocks--;
276279
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
277280
xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, -1L);
278281
return 0;

fs/xfs/libxfs/xfs_ialloc.c

Lines changed: 16 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1834,7 +1834,7 @@ xfs_dialloc(
18341834
* might be sparse and only free the regions that are allocated as part of the
18351835
* chunk.
18361836
*/
1837-
STATIC void
1837+
static int
18381838
xfs_difree_inode_chunk(
18391839
struct xfs_trans *tp,
18401840
xfs_agnumber_t agno,
@@ -1851,10 +1851,10 @@ xfs_difree_inode_chunk(
18511851

18521852
if (!xfs_inobt_issparse(rec->ir_holemask)) {
18531853
/* not sparse, calculate extent info directly */
1854-
xfs_free_extent_later(tp, XFS_AGB_TO_FSB(mp, agno, sagbno),
1855-
M_IGEO(mp)->ialloc_blks,
1856-
&XFS_RMAP_OINFO_INODES);
1857-
return;
1854+
return xfs_free_extent_later(tp,
1855+
XFS_AGB_TO_FSB(mp, agno, sagbno),
1856+
M_IGEO(mp)->ialloc_blks,
1857+
&XFS_RMAP_OINFO_INODES);
18581858
}
18591859

18601860
/* holemask is only 16-bits (fits in an unsigned long) */
@@ -1871,6 +1871,8 @@ xfs_difree_inode_chunk(
18711871
XFS_INOBT_HOLEMASK_BITS);
18721872
nextbit = startidx + 1;
18731873
while (startidx < XFS_INOBT_HOLEMASK_BITS) {
1874+
int error;
1875+
18741876
nextbit = find_next_zero_bit(holemask, XFS_INOBT_HOLEMASK_BITS,
18751877
nextbit);
18761878
/*
@@ -1896,15 +1898,19 @@ xfs_difree_inode_chunk(
18961898

18971899
ASSERT(agbno % mp->m_sb.sb_spino_align == 0);
18981900
ASSERT(contigblk % mp->m_sb.sb_spino_align == 0);
1899-
xfs_free_extent_later(tp, XFS_AGB_TO_FSB(mp, agno, agbno),
1900-
contigblk, &XFS_RMAP_OINFO_INODES);
1901+
error = xfs_free_extent_later(tp,
1902+
XFS_AGB_TO_FSB(mp, agno, agbno),
1903+
contigblk, &XFS_RMAP_OINFO_INODES);
1904+
if (error)
1905+
return error;
19011906

19021907
/* reset range to current bit and carry on... */
19031908
startidx = endidx = nextbit;
19041909

19051910
next:
19061911
nextbit++;
19071912
}
1913+
return 0;
19081914
}
19091915

19101916
STATIC int
@@ -2003,7 +2009,9 @@ xfs_difree_inobt(
20032009
goto error0;
20042010
}
20052011

2006-
xfs_difree_inode_chunk(tp, pag->pag_agno, &rec);
2012+
error = xfs_difree_inode_chunk(tp, pag->pag_agno, &rec);
2013+
if (error)
2014+
goto error0;
20072015
} else {
20082016
xic->deleted = false;
20092017

fs/xfs/libxfs/xfs_log_format.h

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -324,7 +324,6 @@ struct xfs_inode_log_format_32 {
324324
#define XFS_ILOG_DOWNER 0x200 /* change the data fork owner on replay */
325325
#define XFS_ILOG_AOWNER 0x400 /* change the attr fork owner on replay */
326326

327-
328327
/*
329328
* The timestamps are dirty, but not necessarily anything else in the inode
330329
* core. Unlike the other fields above this one must never make it to disk
@@ -333,6 +332,14 @@ struct xfs_inode_log_format_32 {
333332
*/
334333
#define XFS_ILOG_TIMESTAMP 0x4000
335334

335+
/*
336+
* The version field has been changed, but not necessarily anything else of
337+
* interest. This must never make it to disk - it is used purely to ensure that
338+
* the inode item ->precommit operation can update the fsync flag triggers
339+
* in the inode item correctly.
340+
*/
341+
#define XFS_ILOG_IVERSION 0x8000
342+
336343
#define XFS_ILOG_NONCORE (XFS_ILOG_DDATA | XFS_ILOG_DEXT | \
337344
XFS_ILOG_DBROOT | XFS_ILOG_DEV | \
338345
XFS_ILOG_ADATA | XFS_ILOG_AEXT | \

0 commit comments

Comments
 (0)