Commit 033c564

YuBiao Wang authored and alexdeucher committed
drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
[Why]
For engines that do not support soft reset, e.g. VCN, an IB test fails before the mode 1 reset during ASIC reset. The fences in this case are never signaled, and the next time we try to free the sa_bo the kernel hangs.

[How]
During pre_asic_reset the driver clears the job fences, after which each fence's refcount is reduced to 1. For drm_sched_jobs the fence is released in job_free_cb. For non-sched jobs such as ib_test it is meant to be released in sa_bo_free, but only once the fence is signaled. So we have to force-signal the non-sched bad job's fence during pre_asic_reset, or the clear is not complete.

Signed-off-by: YuBiao Wang <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
1 parent 4eb0b49 commit 033c564

1 file changed: +9 -0 lines changed

drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c

Lines changed: 9 additions & 0 deletions
@@ -678,6 +678,15 @@ void amdgpu_fence_driver_clear_job_fences(struct amdgpu_ring *ring)
 		ptr = &ring->fence_drv.fences[i];
 		old = rcu_dereference_protected(*ptr, 1);
 		if (old && old->ops == &amdgpu_job_fence_ops) {
+			struct amdgpu_job *job;
+
+			/* For non-scheduler bad job, i.e. failed ib test, we need to signal
+			 * it right here or we won't be able to track them in fence_drv
+			 * and they will remain unsignaled during sa_bo free.
+			 */
+			job = container_of(old, struct amdgpu_job, hw_fence);
+			if (!job->base.s_fence && !dma_fence_is_signaled(old))
+				dma_fence_signal(old);
 			RCU_INIT_POINTER(*ptr, NULL);
 			dma_fence_put(old);
 		}
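
For readers who want to reason about the hunk outside the kernel, the following is a minimal userspace sketch of the same control flow. It is not the driver code: `mock_fence`, `mock_job`, `mock_ring`, `mock_fence_put` and `mock_clear_job_fences` are hypothetical stand-ins, the `container_of` macro is a simplified local copy, and RCU and the `old->ops == &amdgpu_job_fence_ops` check are left out on the assumption that every slot holds a fence embedded in a job.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace copy of the kernel macro (assumption, not the kernel one). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct dma_fence: just a refcount and a flag. */
struct mock_fence {
	int refcount;
	bool signaled;
};

/* Hypothetical stand-in for struct amdgpu_job. */
struct mock_job {
	void *s_fence;              /* non-NULL only for scheduler-owned jobs */
	struct mock_fence hw_fence; /* fence embedded in the job */
};

#define NUM_SLOTS 4

/* Hypothetical stand-in for ring->fence_drv.fences[]. */
struct mock_ring {
	struct mock_fence *fences[NUM_SLOTS];
};

/* Drop one reference; the real dma_fence_put() also frees at zero. */
static void mock_fence_put(struct mock_fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/*
 * Clear every slot. A fence whose owning job has no scheduler fence
 * (e.g. a failed IB test) is force-signaled first, so that a later
 * free path waiting on the signal does not block forever.
 * Returns the number of fences that were force-signaled.
 */
static int mock_clear_job_fences(struct mock_ring *ring)
{
	int forced = 0;

	for (int i = 0; i < NUM_SLOTS; i++) {
		struct mock_fence *old = ring->fences[i];
		struct mock_job *job;

		if (!old)
			continue;

		/* Recover the job from the address of its embedded fence. */
		job = container_of(old, struct mock_job, hw_fence);
		if (!job->s_fence && !old->signaled) {
			old->signaled = true;
			forced++;
		}

		ring->fences[i] = NULL;
		mock_fence_put(old);
	}
	return forced;
}
```

In this model a scheduler-owned job keeps its fence unsignaled after the clear, because its remaining reference is dropped elsewhere (job_free_cb in the commit message), while a non-sched job's fence is signaled so that the sa_bo free path can release it.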
