
Commit 948b1fe

zhaohem authored and YuKuai-huawei committed
md/md-cluster: handle REMOVE message earlier
Commit a1fd37f ("md: Don't wait for MD_RECOVERY_NEEDED for
HOT_REMOVE_DISK ioctl") introduced a regression in the md_cluster
module (failed cases: 02r1_Manage_re-add & 02r10_Manage_re-add).

Consider a 2-node cluster:
- node1 issues the set-faulty & remove commands on a disk.
- node2 must correctly update the array metadata.

Before a1fd37f, on node1, the delay between msg:METADATA_UPDATED
(triggered by faulty) and msg:REMOVE was sufficient for node2 to
reload the disk info (written by node1).

After a1fd37f, node1 no longer waits between faulty and remove, causing
it to send msg:REMOVE while node2 is still reloading the disk info.
This often results in node2 failing to remove the faulty disk.

== how to trigger ==

Set up a 2-node cluster (node1 & node2) with disks vdc & vdd.

On node1:

  mdadm -CR /dev/md0 -l1 -b clustered -n2 /dev/vdc /dev/vdd --assume-clean
  ssh node2-ip mdadm -A /dev/md0 /dev/vdc /dev/vdd
  mdadm --manage /dev/md0 --fail /dev/vdc --remove /dev/vdc

Check the array status on both nodes with "mdadm -D /dev/md0".

node1 output:

  Number   Major   Minor   RaidDevice   State
     -       0       0        0         removed
     1     254      48        1         active sync   /dev/vdd

node2 output:

  Number   Major   Minor   RaidDevice   State
     -       0       0        0         removed
     1     254      48        1         active sync   /dev/vdd
     0     254      32        -         faulty        /dev/vdc

Fixes: a1fd37f ("md: Don't wait for MD_RECOVERY_NEEDED for HOT_REMOVE_DISK ioctl")
Signed-off-by: Heming Zhao <[email protected]>
Reviewed-by: Su Yue <[email protected]>
Link: https://lore.kernel.org/linux-raid/[email protected]
Signed-off-by: Yu Kuai <[email protected]>
1 parent 1df1fc8 commit 948b1fe

File tree

1 file changed: +6, -3 lines changed

drivers/md/md.c

Lines changed: 6 additions & 3 deletions
@@ -9777,8 +9777,8 @@ void md_check_recovery(struct mddev *mddev)
 		 * remove disk.
 		 */
 		rdev_for_each_safe(rdev, tmp, mddev) {
-			if (test_and_clear_bit(ClusterRemove, &rdev->flags) &&
-					rdev->raid_disk < 0)
+			if (rdev->raid_disk < 0 &&
+			    test_and_clear_bit(ClusterRemove, &rdev->flags))
 				md_kick_rdev_from_array(rdev);
 		}
 	}
@@ -10084,8 +10084,11 @@ static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)

 	/* Check for change of roles in the active devices */
 	rdev_for_each_safe(rdev2, tmp, mddev) {
-		if (test_bit(Faulty, &rdev2->flags))
+		if (test_bit(Faulty, &rdev2->flags)) {
+			if (test_bit(ClusterRemove, &rdev2->flags))
+				set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 			continue;
+		}

 		/* Check if the roles changed */
 		role = le16_to_cpu(sb->dev_roles[rdev2->desc_nr]);
