Commit 1a11a65

YiPeng Chai authored and alexdeucher committed
drm/amdgpu: Enable mode-1 reset for RAS recovery in fatal error mode
Enable mode-1 reset for RAS recovery in fatal error mode. When poison mode is not supported, skip the soft reset check and request a full reset.

Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
1 parent 64a3dbb commit 1a11a65

2 files changed: 10 additions, 1 deletion

drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

Lines changed: 4 additions & 0 deletions
@@ -4593,6 +4593,10 @@ bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev)
 	if (amdgpu_gpu_recovery == 0)
 		goto disabled;
 
+	/* Skip soft reset check in fatal error mode */
+	if (!amdgpu_ras_is_poison_mode_supported(adev))
+		return true;
+
 	if (!amdgpu_device_ip_check_soft_reset(adev)) {
 		dev_info(adev->dev,"Timeout, but no hardware hang detected.\n");
 		return false;
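
The ordering of checks in this hunk can be read as a small decision function. The sketch below is a user-space model only, not the driver code: `struct sketch_device` and `sketch_should_recover_gpu` are hypothetical names, and the three fields stand in for `amdgpu_gpu_recovery`, `amdgpu_ras_is_poison_mode_supported()` and `amdgpu_device_ip_check_soft_reset()`. The real function continues with further checks that the hunk does not show; the model stops at the soft reset check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct amdgpu_device; not the real layout. */
struct sketch_device {
	int gpu_recovery;        /* models the amdgpu_gpu_recovery parameter */
	bool poison_supported;   /* models amdgpu_ras_is_poison_mode_supported() */
	bool soft_reset_needed;  /* models amdgpu_device_ip_check_soft_reset() */
};

/* Simplified model of the check order after this patch (hunk lines only). */
static bool sketch_should_recover_gpu(const struct sketch_device *dev)
{
	assert(dev != NULL);

	/* Recovery explicitly disabled: never recover. */
	if (dev->gpu_recovery == 0)
		return false;

	/* Fatal error mode: skip the soft reset check and recover. */
	if (!dev->poison_supported)
		return true;

	/*
	 * Poison mode: without a detected hang the driver reports
	 * "Timeout, but no hardware hang detected" and does not recover.
	 * The real function runs more checks here; this model does not.
	 */
	return dev->soft_reset_needed;
}
```

With `gpu_recovery` set to 1 and `poison_supported` false, the model returns true regardless of `soft_reset_needed`, which is the behavior the added lines introduce.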

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c

Lines changed: 6 additions & 1 deletion
@@ -1948,7 +1948,12 @@ static void amdgpu_ras_do_recovery(struct work_struct *work)
 
 	reset_context.method = AMD_RESET_METHOD_NONE;
 	reset_context.reset_req_dev = adev;
-	clear_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
+
+	/* Perform full reset in fatal error mode */
+	if (!amdgpu_ras_is_poison_mode_supported(ras->adev))
+		set_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
+	else
+		clear_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
 
 	amdgpu_device_gpu_recover(ras->adev, NULL, &reset_context);
 }
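
The flag handling in this hunk can be modeled outside the kernel as follows. This is a sketch under stated assumptions: `SKETCH_NEED_FULL_RESET`, `struct sketch_reset_context` and both helper functions are hypothetical, the bit index is arbitrary because the diff does not show the value of `AMDGPU_NEED_FULL_RESET`, and plain bitwise operators replace the kernel's atomic `set_bit()` and `clear_bit()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit index; the real AMDGPU_NEED_FULL_RESET value is not in the diff. */
enum { SKETCH_NEED_FULL_RESET = 0 };

/* Hypothetical stand-in for the flags word of the reset context. */
struct sketch_reset_context {
	unsigned long flags;
};

/* Request a full reset in fatal error mode, otherwise clear the request. */
static void sketch_prepare_ras_reset(struct sketch_reset_context *ctx,
				     bool poison_supported)
{
	assert(ctx != NULL);

	if (!poison_supported)
		ctx->flags |= 1UL << SKETCH_NEED_FULL_RESET;    /* non-atomic stand-in for set_bit() */
	else
		ctx->flags &= ~(1UL << SKETCH_NEED_FULL_RESET); /* non-atomic stand-in for clear_bit() */
}

/* Report whether the full reset bit is currently set. */
static bool sketch_needs_full_reset(const struct sketch_reset_context *ctx)
{
	assert(ctx != NULL);
	return ((ctx->flags >> SKETCH_NEED_FULL_RESET) & 1UL) != 0;
}
```

Before the patch the bit was always cleared, so the model corresponds to calling `sketch_prepare_ras_reset()` with `poison_supported` fixed to true.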
