Commit 977c55e

unload: only unload unused runners

We can't unload runners that are in-use, otherwise state corruption will occur in release (which will likely result in a panic() at some point on subsequent inference). If we want a preemption mechanism, additional refactoring will need to occur.

Signed-off-by: Jacob Howard <[email protected]>
1 parent 442f049 commit 977c55e

1 file changed: +2 -1 lines changed

pkg/inference/scheduling/loader.go

Lines changed: 2 additions & 1 deletion
@@ -182,7 +182,8 @@ func (l *loader) evict(idleOnly bool) int {
 func (l *loader) evictRunner(backend, model string) int {
 	allBackends := backend == ""
 	for r, slot := range l.runners {
-		if (allBackends || r.backend == backend) && r.model == model {
+		unused := l.references[slot] == 0
+		if unused && (allBackends || r.backend == backend) && r.model == model {
 			l.log.Infof("Evicting %s backend runner with model %s in %s mode",
 				r.backend, r.model, r.mode,
 			)
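
For illustration only, here is a minimal, self-contained sketch of the reference-count guard this change introduces. It is not the real loader: the `runnerKey` type, the map-based `references` field, the skip message, and the "number evicted" return value are all assumptions made for the example. Only the identifiers `runners`, `references`, `backend`, `model`, and `mode` come from the diff.

```go
package main

import "fmt"

// runnerKey is a hypothetical stand-in for the loader's runner key;
// its fields mirror the ones referenced in the diff.
type runnerKey struct {
	backend, model, mode string
}

// loader is a stripped-down stand-in for the scheduler's loader.
// The concrete types of both fields are assumptions.
type loader struct {
	runners    map[runnerKey]int // runner key -> slot index
	references map[int]int       // slot index -> active reference count
}

// evictRunner evicts runners for the given model (on one backend, or on
// all backends when backend is empty), skipping any runner that is still
// referenced. It returns the number of runners evicted (assumed semantics).
func (l *loader) evictRunner(backend, model string) int {
	allBackends := backend == ""
	evicted := 0
	for r, slot := range l.runners {
		if !(allBackends || r.backend == backend) || r.model != model {
			continue
		}
		// The guard from this commit: never evict a runner that is in use.
		if l.references[slot] != 0 {
			fmt.Printf("Skipping in-use %s backend runner with model %s\n",
				r.backend, r.model)
			continue
		}
		// Deleting the current key while ranging over a map is safe in Go.
		delete(l.runners, r)
		delete(l.references, slot)
		evicted++
	}
	return evicted
}
```

With two runners loaded for the same model, one holding a reference and one idle, a call such as `l.evictRunner("", model)` would remove only the idle one in this sketch. The in-use runner stays loaded until its reference count drops to zero, which matches the commit message's point that true preemption would need further refactoring.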
