Skip to content

Commit 21a57bc

Browse files
fdmananakdave
authored andcommitted
btrfs: fix race between logging inode and checking if it was logged before
There's a race between checking if an inode was logged before and logging an inode that can cause us to mark an inode as not logged just after it was logged by a concurrent task: 1) We have inode X which was not logged before neither in the current transaction not in past transaction since the inode was loaded into memory, so it's ->logged_trans value is 0; 2) We are at transaction N; 3) Task A calls inode_logged() against inode X, sees that ->logged_trans is 0 and there is a log tree and so it proceeds to search in the log tree for an inode item for inode X. It doesn't see any, but before it sets ->logged_trans to N - 1... 3) Task B calls btrfs_log_inode() against inode X, logs the inode and sets ->logged_trans to N; 4) Task A now sets ->logged_trans to N - 1; 5) At this point anyone calling inode_logged() gets 0 (inode not logged) since ->logged_trans is greater than 0 and less than N, but our inode was really logged. As a consequence operations like rename, unlink and link that happen afterwards in the current transaction end up not updating the log when they should. The same type of race happens in case our inode is a directory when we update the inode's ->last_dir_index_offset field at inode_logged() to (u64)-1, and in that case such a race could make directory logging skip logging new entries at process_dir_items_leaf(), since any new dir entry has an index less than (u64). Fix this by ensuring inode_logged() is always called while holding the inode's log_mutex. Fixes: 0f8ce49 ("btrfs: avoid inode logging during rename and link when possible") Reviewed-by: Boris Burkov <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
1 parent bfe023d commit 21a57bc

File tree

1 file changed

+30
-4
lines changed

1 file changed

+30
-4
lines changed

fs/btrfs/tree-log.c

Lines changed: 30 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3485,14 +3485,27 @@ int btrfs_free_log_root_tree(struct btrfs_trans_handle *trans,
34853485
* Returns 1 if the inode was logged before in the transaction, 0 if it was not,
34863486
* and < 0 on error.
34873487
*/
3488-
static int inode_logged(const struct btrfs_trans_handle *trans,
3489-
struct btrfs_inode *inode,
3490-
struct btrfs_path *path_in)
3488+
static int inode_logged_locked(const struct btrfs_trans_handle *trans,
3489+
struct btrfs_inode *inode,
3490+
struct btrfs_path *path_in)
34913491
{
34923492
struct btrfs_path *path = path_in;
34933493
struct btrfs_key key;
34943494
int ret;
34953495

3496+
/*
3497+
* The log_mutex must be taken to prevent races with concurrent logging
3498+
* as we may see the inode not logged when we are called but it gets
3499+
* logged right after we did not find it in the log tree and we end up
3500+
* setting inode->logged_trans to a value less than trans->transid after
3501+
* the concurrent logging task has set it to trans->transid. As a
3502+
* consequence, subsequent rename, unlink and link operations may end up
3503+
* not logging new names and removing old names from the log.
3504+
* The same type of race also happens if out inode is a directory when
3505+
* we update inode->last_dir_index_offset below.
3506+
*/
3507+
lockdep_assert_held(&inode->log_mutex);
3508+
34963509
if (inode->logged_trans == trans->transid)
34973510
return 1;
34983511

@@ -3594,6 +3607,19 @@ static int inode_logged(const struct btrfs_trans_handle *trans,
35943607
return 1;
35953608
}
35963609

3610+
static inline int inode_logged(const struct btrfs_trans_handle *trans,
3611+
struct btrfs_inode *inode,
3612+
struct btrfs_path *path)
3613+
{
3614+
int ret;
3615+
3616+
mutex_lock(&inode->log_mutex);
3617+
ret = inode_logged_locked(trans, inode, path);
3618+
mutex_unlock(&inode->log_mutex);
3619+
3620+
return ret;
3621+
}
3622+
35973623
/*
35983624
* Delete a directory entry from the log if it exists.
35993625
*
@@ -6678,7 +6704,7 @@ static int btrfs_log_inode(struct btrfs_trans_handle *trans,
66786704
* inode_logged(), because after that we have the need to figure out if
66796705
* the inode was previously logged in this transaction.
66806706
*/
6681-
ret = inode_logged(trans, inode, path);
6707+
ret = inode_logged_locked(trans, inode, path);
66826708
if (ret < 0)
66836709
goto out_unlock;
66846710
ctx->logged_before = (ret == 1);

0 commit comments

Comments
 (0)