Skip to content

Commit e6b9bd7

Browse files
Zhihao Chengtytso
authored andcommitted
jbd2: fix data missing when reusing bh which is ready to be checkpointed
Following process will make data lost and could lead to a filesystem corrupted problem: 1. jh(bh) is inserted into T1->t_checkpoint_list, bh is dirty, and jh->b_transaction = NULL 2. T1 is added into journal->j_checkpoint_transactions. 3. Get bh prepare to write while doing checkpoing: PA PB do_get_write_access jbd2_log_do_checkpoint spin_lock(&jh->b_state_lock) if (buffer_dirty(bh)) clear_buffer_dirty(bh) // clear buffer dirty set_buffer_jbddirty(bh) transaction = journal->j_checkpoint_transactions jh = transaction->t_checkpoint_list if (!buffer_dirty(bh)) __jbd2_journal_remove_checkpoint(jh) // bh won't be flushed jbd2_cleanup_journal_tail __jbd2_journal_file_buffer(jh, transaction, BJ_Reserved) 4. Aborting journal/Power-cut before writing latest bh on journal area. In this way we get a corrupted filesystem with bh's data lost. Fix it by moving the clearing of buffer_dirty bit just before the call to __jbd2_journal_file_buffer(), both bit clearing and jh->b_transaction assignment are under journal->j_list_lock locked, so that jbd2_log_do_checkpoint() will wait until jh's new transaction fininshed even bh is currently not dirty. And journal_shrink_one_cp_list() won't remove jh from checkpoint list if the buffer head is reused in do_get_write_access(). Fetch a reproducer in [Link]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216898 Cc: <[email protected]> Signed-off-by: Zhihao Cheng <[email protected]> Signed-off-by: zhanchengbin <[email protected]> Suggested-by: Jan Kara <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
1 parent 3039d8b commit e6b9bd7

File tree

1 file changed

+29
-21
lines changed

1 file changed

+29
-21
lines changed

fs/jbd2/transaction.c

Lines changed: 29 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -1010,36 +1010,28 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
10101010
* ie. locked but not dirty) or tune2fs (which may actually have
10111011
* the buffer dirtied, ugh.) */
10121012

1013-
if (buffer_dirty(bh)) {
1013+
if (buffer_dirty(bh) && jh->b_transaction) {
1014+
warn_dirty_buffer(bh);
10141015
/*
1015-
* First question: is this buffer already part of the current
1016-
* transaction or the existing committing transaction?
1017-
*/
1018-
if (jh->b_transaction) {
1019-
J_ASSERT_JH(jh,
1020-
jh->b_transaction == transaction ||
1021-
jh->b_transaction ==
1022-
journal->j_committing_transaction);
1023-
if (jh->b_next_transaction)
1024-
J_ASSERT_JH(jh, jh->b_next_transaction ==
1025-
transaction);
1026-
warn_dirty_buffer(bh);
1027-
}
1028-
/*
1029-
* In any case we need to clean the dirty flag and we must
1030-
* do it under the buffer lock to be sure we don't race
1031-
* with running write-out.
1016+
* We need to clean the dirty flag and we must do it under the
1017+
* buffer lock to be sure we don't race with running write-out.
10321018
*/
10331019
JBUFFER_TRACE(jh, "Journalling dirty buffer");
10341020
clear_buffer_dirty(bh);
1021+
/*
1022+
* The buffer is going to be added to BJ_Reserved list now and
1023+
* nothing guarantees jbd2_journal_dirty_metadata() will be
1024+
* ever called for it. So we need to set jbddirty bit here to
1025+
* make sure the buffer is dirtied and written out when the
1026+
* journaling machinery is done with it.
1027+
*/
10351028
set_buffer_jbddirty(bh);
10361029
}
10371030

1038-
unlock_buffer(bh);
1039-
10401031
error = -EROFS;
10411032
if (is_handle_aborted(handle)) {
10421033
spin_unlock(&jh->b_state_lock);
1034+
unlock_buffer(bh);
10431035
goto out;
10441036
}
10451037
error = 0;
@@ -1049,8 +1041,10 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
10491041
* b_next_transaction points to it
10501042
*/
10511043
if (jh->b_transaction == transaction ||
1052-
jh->b_next_transaction == transaction)
1044+
jh->b_next_transaction == transaction) {
1045+
unlock_buffer(bh);
10531046
goto done;
1047+
}
10541048

10551049
/*
10561050
* this is the first time this transaction is touching this buffer,
@@ -1074,10 +1068,24 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
10741068
*/
10751069
smp_wmb();
10761070
spin_lock(&journal->j_list_lock);
1071+
if (test_clear_buffer_dirty(bh)) {
1072+
/*
1073+
* Execute buffer dirty clearing and jh->b_transaction
1074+
* assignment under journal->j_list_lock locked to
1075+
* prevent bh being removed from checkpoint list if
1076+
* the buffer is in an intermediate state (not dirty
1077+
* and jh->b_transaction is NULL).
1078+
*/
1079+
JBUFFER_TRACE(jh, "Journalling dirty buffer");
1080+
set_buffer_jbddirty(bh);
1081+
}
10771082
__jbd2_journal_file_buffer(jh, transaction, BJ_Reserved);
10781083
spin_unlock(&journal->j_list_lock);
1084+
unlock_buffer(bh);
10791085
goto done;
10801086
}
1087+
unlock_buffer(bh);
1088+
10811089
/*
10821090
* If there is already a copy-out version of this buffer, then we don't
10831091
* need to make another one

0 commit comments

Comments
 (0)