Commit 35c5a09

author
Darrick J. Wong
committed
Merge tag 'xfs-buf-lockless-lookup-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-5.20-mergeB
xfs: lockless buffer cache lookups

Current work to merge the XFS inode life cycle with the VFS inode life cycle is finding some interesting issues. If we have a path that hits buffer trylocks fairly hard (e.g. a non-blocking background inode freeing function), we end up hitting massive contention on the buffer cache hash locks:

-   92.71%     0.05%  [kernel]  [k] xfs_inodegc_worker
   - 92.67% xfs_inodegc_worker
      - 92.13% xfs_inode_unlink
         - 91.52% xfs_inactive_ifree
            - 85.63% xfs_read_agi
               - 85.61% xfs_trans_read_buf_map
                  - 85.59% xfs_buf_read_map
                     - xfs_buf_get_map
                        - 85.55% xfs_buf_find
                           - 72.87% _raw_spin_lock
                              - do_raw_spin_lock
                                   71.86% __pv_queued_spin_lock_slowpath
                           - 8.74% xfs_buf_rele
                              - 7.88% _raw_spin_lock
                                 - 7.88% do_raw_spin_lock
                                      7.63% __pv_queued_spin_lock_slowpath
                           - 1.70% xfs_buf_trylock
                              - 1.68% down_trylock
                                 - 1.41% _raw_spin_lock_irqsave
                                    - 1.39% do_raw_spin_lock
                                         __pv_queued_spin_lock_slowpath
                           - 0.76% _raw_spin_unlock
                                0.75% do_raw_spin_unlock

This is basically hammering the pag->pag_buf_lock from lots of CPUs doing trylocks at the same time. Most of the buffer trylock operations ultimately fail after we've done the lookup, so we're really hammering the buf hash lock whilst making no progress.

We can also see significant spinlock traffic on the same lock just under normal operation when lots of tasks are accessing metadata from the same AG, so let's avoid all this by creating a lookup fast path which leverages the rhashtable's ability to do RCU protected lookups.

Signed-off-by: Dave Chinner <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

* tag 'xfs-buf-lockless-lookup-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
  xfs: lockless buffer lookup
  xfs: remove a superflous hash lookup when inserting new buffers
  xfs: reduce the number of atomic when locking a buffer after lookup
  xfs: merge xfs_buf_find() and xfs_buf_get_map()
  xfs: break up xfs_buf_find() into individual pieces
  xfs: rework xfs_buf_incore() API
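
Note on the lookup fast path: once lookups no longer take the hash lock, a lookup can find a buffer that is concurrently being torn down. One common way to cope with that is to take a reference only while the hold count is still non-zero, and to treat a zero count as a miss. The following is a minimal userspace sketch of that idea in C11 atomics. It is not code from this series; `demo_buf`, `demo_hold_unless_zero()` and `demo_rele()` are hypothetical names, and the RCU read-side protection that keeps the memory valid during a real lookup is assumed rather than shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached buffer; only the hold count matters. */
struct demo_buf {
	atomic_int	hold;	/* 0 means the buffer is being torn down */
};

/*
 * Take a reference only if the hold count is still non-zero.
 * Returns true if a reference was taken, false if the buffer is dying
 * and the caller must treat the lookup as a miss.
 */
static bool demo_hold_unless_zero(struct demo_buf *bp)
{
	int old = atomic_load(&bp->hold);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&bp->hold, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken by demo_hold_unless_zero(). */
static void demo_rele(struct demo_buf *bp)
{
	int old = atomic_fetch_sub(&bp->hold, 1);

	assert(old > 0);
	(void)old;
}
```

A caller would look the buffer up without taking the hash lock, call `demo_hold_unless_zero()`, and fall back to the locked slow path (or report a miss) when it returns false.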
2 parents 4613b17 + 298f342 commit 35c5a09

5 files changed, 188 additions and 135 deletions

fs/xfs/libxfs/xfs_attr_remote.c

Lines changed: 10 additions & 5 deletions
@@ -543,21 +543,26 @@ xfs_attr_rmtval_stale(
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_buf		*bp;
+	int			error;
 
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
 
 	if (XFS_IS_CORRUPT(mp, map->br_startblock == DELAYSTARTBLOCK) ||
 	    XFS_IS_CORRUPT(mp, map->br_startblock == HOLESTARTBLOCK))
 		return -EFSCORRUPTED;
 
-	bp = xfs_buf_incore(mp->m_ddev_targp,
+	error = xfs_buf_incore(mp->m_ddev_targp,
 			XFS_FSB_TO_DADDR(mp, map->br_startblock),
-			XFS_FSB_TO_BB(mp, map->br_blockcount), incore_flags);
-	if (bp) {
-		xfs_buf_stale(bp);
-		xfs_buf_relse(bp);
+			XFS_FSB_TO_BB(mp, map->br_blockcount),
+			incore_flags, &bp);
+	if (error) {
+		if (error == -ENOENT)
+			return 0;
+		return error;
 	}
 
+	xfs_buf_stale(bp);
+	xfs_buf_relse(bp);
 	return 0;
 }

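Note on the calling convention in this hunk: xfs_buf_incore() now returns an error code and hands the buffer back through an out-parameter, and the caller treats -ENOENT ("not cached") as success because there is nothing to invalidate. A minimal userspace sketch of that pattern follows. The names `demo_buf`, `demo_cache`, `demo_buf_incore()` and `demo_stale_block()` are hypothetical and the array-backed cache is an assumption for illustration, not the XFS implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical cached buffer: a block number and a stale flag. */
struct demo_buf {
	long	blkno;
	int	stale;
};

#define DEMO_NBUFS 2
static struct demo_buf demo_cache[DEMO_NBUFS] = { { 8, 0 }, { 16, 0 } };

/*
 * Look up a cached buffer. Returns 0 and sets *bpp on a hit, or
 * -ENOENT with *bpp set to NULL on a miss.
 */
static int demo_buf_incore(long blkno, struct demo_buf **bpp)
{
	*bpp = NULL;
	for (size_t i = 0; i < DEMO_NBUFS; i++) {
		if (demo_cache[i].blkno == blkno) {
			*bpp = &demo_cache[i];
			return 0;
		}
	}
	return -ENOENT;
}

/* Mark a block stale if it is cached; a miss is not an error. */
static int demo_stale_block(long blkno)
{
	struct demo_buf *bp;
	int error;

	error = demo_buf_incore(blkno, &bp);
	if (error == -ENOENT)
		return 0;	/* nothing cached, nothing to invalidate */
	if (error)
		return error;	/* propagate any other failure */

	assert(bp != NULL);
	bp->stale = 1;
	return 0;
}
```

Returning an error code instead of a bare pointer lets the caller tell "not in cache" apart from other lookup failures, which a NULL return cannot express.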

fs/xfs/scrub/repair.c

Lines changed: 9 additions & 6 deletions
@@ -454,16 +454,19 @@ xrep_invalidate_blocks(
 	 * assume it's owned by someone else.
 	 */
 	for_each_xbitmap_block(fsbno, bmr, n, bitmap) {
+		int		error;
+
 		/* Skip AG headers and post-EOFS blocks */
 		if (!xfs_verify_fsbno(sc->mp, fsbno))
 			continue;
-		bp = xfs_buf_incore(sc->mp->m_ddev_targp,
+		error = xfs_buf_incore(sc->mp->m_ddev_targp,
 				XFS_FSB_TO_DADDR(sc->mp, fsbno),
-				XFS_FSB_TO_BB(sc->mp, 1), XBF_TRYLOCK);
-		if (bp) {
-			xfs_trans_bjoin(sc->tp, bp);
-			xfs_trans_binval(sc->tp, bp);
-		}
+				XFS_FSB_TO_BB(sc->mp, 1), XBF_TRYLOCK, &bp);
+		if (error)
+			continue;
+
+		xfs_trans_bjoin(sc->tp, bp);
+		xfs_trans_binval(sc->tp, bp);
 	}
 
 	return 0;
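
Note on the loop in this hunk: with XBF_TRYLOCK the lookup can fail either because the block is not cached or because its buffer is locked by someone else, and the loop skips the block in both cases. A minimal userspace sketch of that skip-on-any-error loop follows. All names are hypothetical, the use of -EAGAIN for a failed trylock is an assumption of the sketch, and unlike the real code (which joins the buffer to a transaction) the sketch simply unlocks the buffer after marking it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cached buffer with a simple lock flag. */
struct demo_buf {
	long	blkno;
	bool	locked;
	bool	invalidated;
};

/*
 * Find and trylock a cached buffer. Returns 0 with *bpp locked,
 * -EAGAIN if it is cached but already locked (assumed error code),
 * or -ENOENT if it is not cached.
 */
static int demo_incore_trylock(struct demo_buf *cache, size_t n,
			       long blkno, struct demo_buf **bpp)
{
	*bpp = NULL;
	for (size_t i = 0; i < n; i++) {
		if (cache[i].blkno != blkno)
			continue;
		if (cache[i].locked)
			return -EAGAIN;
		cache[i].locked = true;
		*bpp = &cache[i];
		return 0;
	}
	return -ENOENT;
}

/* Invalidate whatever can be locked right now; skip everything else. */
static size_t demo_invalidate_blocks(struct demo_buf *cache, size_t n,
				     const long *blknos, size_t nblk)
{
	size_t done = 0;

	for (size_t i = 0; i < nblk; i++) {
		struct demo_buf *bp;

		if (demo_incore_trylock(cache, n, blknos[i], &bp))
			continue;	/* not cached or busy: leave it alone */

		assert(bp != NULL);
		bp->invalidated = true;
		bp->locked = false;	/* simplification: unlock right away */
		done++;
	}
	return done;
}
```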
