Commit 8d5a74a

Author: Piotr Stankiewicz
Reload defunct runners
If a runner becomes defunct, e.g. as a result of a backend crash, it would be neat to be able to reload it. So, if the loader finds an existing runner, have it check whether the runner is still alive, and create a new one if the runner is defunct.

Signed-off-by: Piotr Stankiewicz <[email protected]>
1 parent 8aa7a28 commit 8d5a74a

1 file changed: +10 -3 lines changed

pkg/inference/scheduling/loader.go

Lines changed: 10 additions & 3 deletions
@@ -372,9 +372,16 @@ func (l *loader) load(ctx context.Context, backendName, model string, mode infer
 	// See if we can satisfy the request with an existing runner.
 	existing, ok := l.runners[runnerKey{backendName, model, mode}]
 	if ok {
-		l.references[existing] += 1
-		l.timestamps[existing] = time.Time{}
-		return l.slots[existing], nil
+		select {
+		case <-l.slots[existing].done:
+			l.log.Warnf("Will reload defunct %s runner for %s. Runner error: %s.", backendName, model,
+				l.slots[existing].err)
+			l.evictRunner(backendName, model)
+		default:
+			l.references[existing] += 1
+			l.timestamps[existing] = time.Time{}
+			return l.slots[existing], nil
+		}
 	}
 
 	// If there's not sufficient memory or all slots are full, then try
