
Commit c466e33

fdmanana authored and kdave committed
btrfs: propagate last_unlink_trans earlier when doing a rmdir
In case the removed directory had a snapshot that was deleted, we propagate its inode's last_unlink_trans to the parent directory only after we have removed the entry from the parent directory. This leaves a small race window where someone can log the parent directory after we removed the entry and before we updated last_unlink_trans. If we ever try to replay such a log tree, we will fail, since we will attempt to remove a snapshot during log replay, which is currently not possible and causes the log replay (and the mount) to fail. This is the type of failure described in commit 1ec9a1a ("Btrfs: fix unreplayable log after snapshot delete + parent dir fsync").

So fix this by propagating the last_unlink_trans to the parent directory before we remove the entry from it.

Fixes: 44f714d ("Btrfs: improve performance on fsync against new inode after rename/unlink")
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
1 parent bf5bcf9 commit c466e33

1 file changed, 18 additions and 18 deletions

fs/btrfs/inode.c

Lines changed: 18 additions & 18 deletions
@@ -4710,7 +4710,6 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
 	struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
 	int ret = 0;
 	struct btrfs_trans_handle *trans;
-	u64 last_unlink_trans;
 	struct fscrypt_name fname;
 
 	if (inode->i_size > BTRFS_EMPTY_DIR_SIZE)
@@ -4736,6 +4735,23 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
 		goto out_notrans;
 	}
 
+	/*
+	 * Propagate the last_unlink_trans value of the deleted dir to its
+	 * parent directory. This is to prevent an unrecoverable log tree in the
+	 * case we do something like this:
+	 * 1) create dir foo
+	 * 2) create snapshot under dir foo
+	 * 3) delete the snapshot
+	 * 4) rmdir foo
+	 * 5) mkdir foo
+	 * 6) fsync foo or some file inside foo
+	 *
+	 * This is because we can't unlink other roots when replaying the dir
+	 * deletes for directory foo.
+	 */
+	if (BTRFS_I(inode)->last_unlink_trans >= trans->transid)
+		BTRFS_I(dir)->last_unlink_trans = BTRFS_I(inode)->last_unlink_trans;
+
 	if (unlikely(btrfs_ino(BTRFS_I(inode)) == BTRFS_EMPTY_SUBVOL_DIR_OBJECTID)) {
 		ret = btrfs_unlink_subvol(trans, BTRFS_I(dir), dentry);
 		goto out;
@@ -4745,27 +4761,11 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
 	if (ret)
 		goto out;
 
-	last_unlink_trans = BTRFS_I(inode)->last_unlink_trans;
-
 	/* now the directory is empty */
 	ret = btrfs_unlink_inode(trans, BTRFS_I(dir), BTRFS_I(d_inode(dentry)),
 				 &fname.disk_name);
-	if (!ret) {
+	if (!ret)
 		btrfs_i_size_write(BTRFS_I(inode), 0);
-		/*
-		 * Propagate the last_unlink_trans value of the deleted dir to
-		 * its parent directory. This is to prevent an unrecoverable
-		 * log tree in the case we do something like this:
-		 * 1) create dir foo
-		 * 2) create snapshot under dir foo
-		 * 3) delete the snapshot
-		 * 4) rmdir foo
-		 * 5) mkdir foo
-		 * 6) fsync foo or some file inside foo
-		 */
-		if (last_unlink_trans >= trans->transid)
-			BTRFS_I(dir)->last_unlink_trans = last_unlink_trans;
-	}
 out:
 	btrfs_end_transaction(trans);
 out_notrans:
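
The ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch and not kernel code: `struct model_inode`, `log_would_be_unreplayable()`, `rmdir_propagate_after()` and `rmdir_propagate_before()` are hypothetical names invented for illustration. It also assumes, as a simplification, that the logging path treats a parent directory as safe whenever its last_unlink_trans is at or beyond the current transaction id. Each rmdir variant samples what a concurrent logger would observe between its two steps and again at the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for in-memory inode state. */
struct model_inode {
	uint64_t last_unlink_trans;
	bool has_child_entry;	/* parent only: entry for the removed dir exists */
};

/*
 * Models a logger inspecting the parent directory at one instant.
 * Assumption for this sketch: the log is only problematic if the entry
 * removal is already visible while last_unlink_trans has not yet been
 * raised to the current transaction id.
 */
static bool log_would_be_unreplayable(const struct model_inode *parent,
				      uint64_t transid)
{
	bool entry_removed = !parent->has_child_entry;
	bool parent_marked = parent->last_unlink_trans >= transid;

	return entry_removed && !parent_marked;
}

/* Old order: remove the entry first, propagate afterwards. */
static bool rmdir_propagate_after(struct model_inode *parent,
				  const struct model_inode *child,
				  uint64_t transid)
{
	bool unsafe;

	parent->has_child_entry = false;
	/* A logger running here sees the removal but a stale value. */
	unsafe = log_would_be_unreplayable(parent, transid);
	if (child->last_unlink_trans >= transid)
		parent->last_unlink_trans = child->last_unlink_trans;
	return unsafe || log_would_be_unreplayable(parent, transid);
}

/* New order: propagate first, then remove the entry. */
static bool rmdir_propagate_before(struct model_inode *parent,
				   const struct model_inode *child,
				   uint64_t transid)
{
	bool unsafe;

	if (child->last_unlink_trans >= transid)
		parent->last_unlink_trans = child->last_unlink_trans;
	/* A logger running here still sees the entry, so nothing is lost. */
	unsafe = log_would_be_unreplayable(parent, transid);
	parent->has_child_entry = false;
	return unsafe || log_would_be_unreplayable(parent, transid);
}
```

With a child whose last_unlink_trans equals the current transaction id (a snapshot was deleted under it in this transaction), the first variant reports an unsafe window and the second does not. Both leave the parent with the same final last_unlink_trans, which matches the commit's claim that only the ordering changes.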
