Skip to content

Commit 986bf6e

Browse files
fdmananakdave
authored andcommitted
btrfs: avoid load/store tearing races when checking if an inode was logged
At inode_logged() we do a couple lockless checks for ->logged_trans, and these are generally safe except the second one in case we get a load or store tearing due to a concurrent call updating ->logged_trans (either at btrfs_log_inode() or later at inode_logged()). In the first case it's safe to compare to the current transaction ID since once ->logged_trans is set the current transaction, we never set it to a lower value. In the second case, where we check if it's greater than zero, we are prone to load/store tearing races, since we can have a concurrent task updating to the current transaction ID with store tearing for example, instead of updating with a single 64 bits write, to update with two 32 bits writes or four 16 bits writes. In that case the reading side at inode_logged() could see a positive value that does not match the current transaction and then return a false negative. Fix this by doing the second check while holding the inode's spinlock, add some comments about it too. Also add the data_race() annotation to the first check to avoid any reports from KCSAN (or similar tools) and comment about it. Fixes: 0f8ce49 ("btrfs: avoid inode logging during rename and link when possible") Reviewed-by: Boris Burkov <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
1 parent 59a0dd4 commit 986bf6e

File tree

1 file changed

+21
-4
lines changed

1 file changed

+21
-4
lines changed

fs/btrfs/tree-log.c

Lines changed: 21 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3382,15 +3382,32 @@ static int inode_logged(const struct btrfs_trans_handle *trans,
33823382
struct btrfs_key key;
33833383
int ret;
33843384

3385-
if (inode->logged_trans == trans->transid)
3385+
/*
3386+
* Quick lockless call, since once ->logged_trans is set to the current
3387+
* transaction, we never set it to a lower value anywhere else.
3388+
*/
3389+
if (data_race(inode->logged_trans) == trans->transid)
33863390
return 1;
33873391

33883392
/*
3389-
* If logged_trans is not 0, then we know the inode logged was not logged
3390-
* in this transaction, so we can return false right away.
3393+
* If logged_trans is not 0 and not trans->transid, then we know the
3394+
* inode was not logged in this transaction, so we can return false
3395+
* right away. We take the lock to avoid a race caused by load/store
3396+
* tearing with a concurrent btrfs_log_inode() call or a concurrent task
3397+
* in this function further below - an update to trans->transid can be
3398+
* teared into two 32 bits updates for example, in which case we could
3399+
* see a positive value that is not trans->transid and assume the inode
3400+
* was not logged when it was.
33913401
*/
3392-
if (inode->logged_trans > 0)
3402+
spin_lock(&inode->lock);
3403+
if (inode->logged_trans == trans->transid) {
3404+
spin_unlock(&inode->lock);
3405+
return 1;
3406+
} else if (inode->logged_trans > 0) {
3407+
spin_unlock(&inode->lock);
33933408
return 0;
3409+
}
3410+
spin_unlock(&inode->lock);
33943411

33953412
/*
33963413
* If no log tree was created for this root in this transaction, then

0 commit comments

Comments
 (0)