
Commit 0e38695

ickle authored and jlahtine-intel committed
drm/i915: Check for awaits on still currently executing requests
With the advent of preempt-to-busy, a request may still be on the GPU as we unwind. In the case of an unpreemptible (due to HW) request, that request will remain indefinitely on the GPU even though we have returned it to our submission queue and cleared the active bit.

We only run the execution callbacks on transferring the request from our submission queue to the execution queue, but if this is a bonded request that the HW is waiting for, we will not submit it (as we wait for a fresh execution) even though it is still being executed.

As we know that there are always preemption points between requests, we know that only the currently executing request may still be active even though we have cleared the flag. However, we do not know precisely which request is in ELSP[0] due to a delay in processing events, and furthermore we only store the last request in a context in our state tracker.

Fixes: 22b7a42 ("drm/i915/execlists: Preempt-to-busy")
Testcase: igt/gem_exec_balancer/bonded-dual
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit b55230e)
Signed-off-by: Joonas Lahtinen <[email protected]>
1 parent 631a658 commit 0e38695

File tree

1 file changed (+48 −1 lines)

drivers/gpu/drm/i915/i915_request.c

Lines changed: 48 additions & 1 deletion
@@ -357,6 +357,53 @@ void i915_request_retire_upto(struct i915_request *rq)
 	} while (i915_request_retire(tmp) && tmp != rq);
 }
 
+static struct i915_request * const *
+__engine_active(struct intel_engine_cs *engine)
+{
+	return READ_ONCE(engine->execlists.active);
+}
+
+static bool __request_in_flight(const struct i915_request *signal)
+{
+	struct i915_request * const *port, *rq;
+	bool inflight = false;
+
+	if (!i915_request_is_ready(signal))
+		return false;
+
+	/*
+	 * Even if we have unwound the request, it may still be on
+	 * the GPU (preempt-to-busy). If that request is inside an
+	 * unpreemptible critical section, it will not be removed. Some
+	 * GPU functions may even be stuck waiting for the paired request
+	 * (__await_execution) to be submitted and cannot be preempted
+	 * until the bond is executing.
+	 *
+	 * As we know that there are always preemption points between
+	 * requests, we know that only the currently executing request
+	 * may still be active even though we have cleared the flag.
+	 * However, we can't rely on our tracking of ELSP[0] to know
+	 * which request is currently active and so may be stuck, as
+	 * the tracking may be an event behind. Instead assume that
+	 * if the context is still inflight, then it is still active
+	 * even if the active flag has been cleared.
+	 */
+	if (!intel_context_inflight(signal->context))
+		return false;
+
+	rcu_read_lock();
+	for (port = __engine_active(signal->engine); (rq = *port); port++) {
+		if (rq->context == signal->context) {
+			inflight = i915_seqno_passed(rq->fence.seqno,
+						     signal->fence.seqno);
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return inflight;
+}
+
 static int
 __await_execution(struct i915_request *rq,
 		  struct i915_request *signal,
@@ -387,7 +434,7 @@ __await_execution(struct i915_request *rq,
 	}
 
 	spin_lock_irq(&signal->lock);
-	if (i915_request_is_active(signal)) {
+	if (i915_request_is_active(signal) || __request_in_flight(signal)) {
 		if (hook) {
 			hook(rq, &signal->fence);
 			i915_request_put(signal);

Comments (0)