Skip to content

Commit 8ca966a

Browse files
fdmananakdave
authored andcommitted
btrfs: avoid load/store tearing races when checking if an inode was logged
At inode_logged() we do a couple lockless checks for ->logged_trans, and these are generally safe except the second one in case we get a load or store tearing due to a concurrent call updating ->logged_trans (either at btrfs_log_inode() or later at inode_logged()). In the first case it's safe to compare to the current transaction ID since once ->logged_trans is set the current transaction, we never set it to a lower value. In the second case, where we check if it's greater than zero, we are prone to load/store tearing races, since we can have a concurrent task updating to the current transaction ID with store tearing for example, instead of updating with a single 64 bits write, to update with two 32 bits writes or four 16 bits writes. In that case the reading side at inode_logged() could see a positive value that does not match the current transaction and then return a false negative. Fix this by doing the second check while holding the inode's spinlock, add some comments about it too. Also add the data_race() annotation to the first check to avoid any reports from KCSAN (or similar tools) and comment about it. Fixes: 0f8ce49 ("btrfs: avoid inode logging during rename and link when possible") Reviewed-by: Boris Burkov <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
1 parent 1041b71 commit 8ca966a

File tree

1 file changed

+21
-4
lines changed

1 file changed

+21
-4
lines changed

fs/btrfs/tree-log.c

Lines changed: 21 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3531,15 +3531,32 @@ static int inode_logged(const struct btrfs_trans_handle *trans,
35313531
struct btrfs_key key;
35323532
int ret;
35333533

3534-
if (inode->logged_trans == trans->transid)
3534+
/*
3535+
* Quick lockless call, since once ->logged_trans is set to the current
3536+
* transaction, we never set it to a lower value anywhere else.
3537+
*/
3538+
if (data_race(inode->logged_trans) == trans->transid)
35353539
return 1;
35363540

35373541
/*
3538-
* If logged_trans is not 0, then we know the inode logged was not logged
3539-
* in this transaction, so we can return false right away.
3542+
* If logged_trans is not 0 and not trans->transid, then we know the
3543+
* inode was not logged in this transaction, so we can return false
3544+
* right away. We take the lock to avoid a race caused by load/store
3545+
* tearing with a concurrent btrfs_log_inode() call or a concurrent task
3546+
* in this function further below - an update to trans->transid can be
3547+
* teared into two 32 bits updates for example, in which case we could
3548+
* see a positive value that is not trans->transid and assume the inode
3549+
* was not logged when it was.
35403550
*/
3541-
if (inode->logged_trans > 0)
3551+
spin_lock(&inode->lock);
3552+
if (inode->logged_trans == trans->transid) {
3553+
spin_unlock(&inode->lock);
3554+
return 1;
3555+
} else if (inode->logged_trans > 0) {
3556+
spin_unlock(&inode->lock);
35423557
return 0;
3558+
}
3559+
spin_unlock(&inode->lock);
35433560

35443561
/*
35453562
* If no log tree was created for this root in this transaction, then

0 commit comments

Comments
 (0)