Skip to content

Commit b63aa0f

Browse files
fdmananakdave
authored andcommitted
btrfs: avoid load/store tearing races when checking if an inode was logged
At inode_logged() we do a couple lockless checks for ->logged_trans, and these are generally safe except the second one in case we get a load or store tearing due to a concurrent call updating ->logged_trans (either at btrfs_log_inode() or later at inode_logged()). In the first case it's safe to compare to the current transaction ID since once ->logged_trans is set the current transaction, we never set it to a lower value. In the second case, where we check if it's greater than zero, we are prone to load/store tearing races, since we can have a concurrent task updating to the current transaction ID with store tearing for example, instead of updating with a single 64 bits write, to update with two 32 bits writes or four 16 bits writes. In that case the reading side at inode_logged() could see a positive value that does not match the current transaction and then return a false negative. Fix this by doing the second check while holding the inode's spinlock, add some comments about it too. Also add the data_race() annotation to the first check to avoid any reports from KCSAN (or similar tools) and comment about it. Fixes: 0f8ce49 ("btrfs: avoid inode logging during rename and link when possible") Reviewed-by: Boris Burkov <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
1 parent f675c10 commit b63aa0f

File tree

1 file changed

+21
-4
lines changed

1 file changed

+21
-4
lines changed

fs/btrfs/tree-log.c

Lines changed: 21 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3525,15 +3525,32 @@ static int inode_logged(const struct btrfs_trans_handle *trans,
35253525
struct btrfs_key key;
35263526
int ret;
35273527

3528-
if (inode->logged_trans == trans->transid)
3528+
/*
3529+
* Quick lockless call, since once ->logged_trans is set to the current
3530+
* transaction, we never set it to a lower value anywhere else.
3531+
*/
3532+
if (data_race(inode->logged_trans) == trans->transid)
35293533
return 1;
35303534

35313535
/*
3532-
* If logged_trans is not 0, then we know the inode logged was not logged
3533-
* in this transaction, so we can return false right away.
3536+
* If logged_trans is not 0 and not trans->transid, then we know the
3537+
* inode was not logged in this transaction, so we can return false
3538+
* right away. We take the lock to avoid a race caused by load/store
3539+
* tearing with a concurrent btrfs_log_inode() call or a concurrent task
3540+
* in this function further below - an update to trans->transid can be
3541+
* teared into two 32 bits updates for example, in which case we could
3542+
* see a positive value that is not trans->transid and assume the inode
3543+
* was not logged when it was.
35343544
*/
3535-
if (inode->logged_trans > 0)
3545+
spin_lock(&inode->lock);
3546+
if (inode->logged_trans == trans->transid) {
3547+
spin_unlock(&inode->lock);
3548+
return 1;
3549+
} else if (inode->logged_trans > 0) {
3550+
spin_unlock(&inode->lock);
35363551
return 0;
3552+
}
3553+
spin_unlock(&inode->lock);
35373554

35383555
/*
35393556
* If no log tree was created for this root in this transaction, then

0 commit comments

Comments
 (0)