Commit 8af39ec

drm/sched: Increment job count before swapping tail spsc queue
A small race exists between spsc_queue_push and the run-job worker, in which spsc_queue_push may return not-first while the run-job worker has already idled due to the job count being zero. If this race occurs, job scheduling stops, leading to hangs while waiting on the job’s DMA fences. Seal this race by incrementing the job count before appending to the SPSC queue.

This race was observed on a drm-tip 6.16-rc1 build with the Xe driver in an SVM test case.

Fixes: 1b1f42d ("drm: move amd_gpu_scheduler into common location")
Fixes: 27105db ("drm/amdgpu: Add SPSC queue to scheduler.")
Cc: [email protected]
Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Jonathan Cavitt <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
1 parent 7dfede7 commit 8af39ec

1 file changed: +3 -1 lines changed

include/drm/spsc_queue.h

@@ -70,9 +70,11 @@ static inline bool spsc_queue_push(struct spsc_queue *queue, struct spsc_node *n

 	preempt_disable();

+	atomic_inc(&queue->job_count);
+	smp_mb__after_atomic();
+
 	tail = (struct spsc_node **)atomic_long_xchg(&queue->tail, (long)&node->next);
 	WRITE_ONCE(*tail, node);
-	atomic_inc(&queue->job_count);

 	/*
 	 * In case of first element verify new node will be visible to the consumer
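
The ordering described in the commit message can be illustrated with a minimal userspace sketch in C11 atomics. This is not the kernel code. The `sketch_*` names are hypothetical, a sequentially consistent `atomic_fetch_add` stands in for `atomic_inc()` followed by `smp_mb__after_atomic()`, and the `preempt_disable()` section is omitted. The point it tries to show is that the count is raised before the tail is swapped, so by the time a later push can return not-first, a consumer checking the count should no longer see zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct spsc_node / struct spsc_queue. */
struct sketch_node {
	_Atomic(struct sketch_node *) next;
};

typedef _Atomic(struct sketch_node *) sketch_link;

struct sketch_queue {
	sketch_link head;             /* first node, NULL when empty */
	_Atomic(sketch_link *) tail;  /* points at the last node's next link */
	atomic_int job_count;         /* what the consumer checks before idling */
};

static void sketch_queue_init(struct sketch_queue *q)
{
	atomic_init(&q->head, NULL);
	atomic_init(&q->tail, &q->head);
	atomic_init(&q->job_count, 0);
}

/* Returns true if the pushed node became the first element of the queue. */
static bool sketch_queue_push(struct sketch_queue *q, struct sketch_node *n)
{
	sketch_link *tail;

	assert(n != NULL);
	atomic_init(&n->next, NULL);

	/*
	 * Count first: a seq_cst read-modify-write, used here as an
	 * assumption-level analogue of atomic_inc() + smp_mb__after_atomic().
	 */
	atomic_fetch_add(&q->job_count, 1);

	/* Only now claim the tail slot and publish the node through it. */
	tail = atomic_exchange(&q->tail, &n->next);
	atomic_store(tail, n);

	return tail == &q->head;
}

/* Consumer-side check: the worker would idle only when this is false. */
static bool sketch_consumer_has_work(struct sketch_queue *q)
{
	return atomic_load(&q->job_count) != 0;
}
```

With the old ordering (swap the tail, publish, then count), there is a window in which the tail has already moved but the count is still zero. The sketch closes that window only in the sense shown here; the real fix relies on the kernel's own barrier semantics.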
