Commit 77f24ab

Unload configs based on model ID and for both modes.
Follow-up to #98.

Signed-off-by: Jacob Howard <jacob.howard@docker.com>
1 parent 6a695dc commit 77f24ab

1 file changed (+2, -1 lines changed)

pkg/inference/scheduling/loader.go

Lines changed: 2 additions & 1 deletion
@@ -263,7 +263,8 @@ func (l *loader) Unload(ctx context.Context, unload UnloadRequest) int {
 	} else {
 		for _, model := range unload.Models {
 			modelID := l.modelManager.ResolveModelID(model)
-			delete(l.runnerConfigs, runnerKey{unload.Backend, model, inference.BackendModeCompletion})
+			delete(l.runnerConfigs, runnerKey{unload.Backend, modelID, inference.BackendModeCompletion})
+			delete(l.runnerConfigs, runnerKey{unload.Backend, modelID, inference.BackendModeEmbedding})
 			// Evict both, completion and embedding models. We should consider
 			// accepting a mode parameter in unload requests.
 			l.evictRunner(unload.Backend, modelID, inference.BackendModeCompletion)
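
To illustrate why the key matters, here is a minimal, self-contained sketch of the pattern this hunk moves to. It is not the code from loader.go: `configKey`, `resolveModelID`, `unloadConfigs`, the alias table, and all sample values are hypothetical stand-ins, and it assumes (as the fix implies) that configs are stored under the resolved model ID. Under that assumption, a `delete` keyed by the unresolved model reference matches nothing and fails silently, while deleting by the resolved ID for both modes clears every entry.

```go
package main

import "fmt"

// mode is a hypothetical stand-in for inference.BackendMode.
type mode int

const (
	modeCompletion mode = iota
	modeEmbedding
)

// configKey mirrors the shape of runnerKey in the diff:
// backend name, model ID, and backend mode.
type configKey struct {
	backend string
	modelID string
	mode    mode
}

// resolveModelID is a hypothetical resolver that maps a user-facing model
// reference to a canonical ID. Unknown references pass through unchanged.
func resolveModelID(aliases map[string]string, ref string) string {
	if id, ok := aliases[ref]; ok {
		return id
	}
	return ref
}

// unloadConfigs deletes the stored config for each requested model in both
// modes, keyed by the resolved ID, and returns the number of entries removed.
func unloadConfigs(configs map[configKey]string, aliases map[string]string, backend string, models []string) int {
	removed := 0
	for _, ref := range models {
		id := resolveModelID(aliases, ref)
		for _, m := range []mode{modeCompletion, modeEmbedding} {
			key := configKey{backend, id, m}
			if _, ok := configs[key]; ok {
				delete(configs, key)
				removed++
			}
		}
	}
	return removed
}

func main() {
	// Sample data only; none of these names come from the repository.
	aliases := map[string]string{"example/model": "id-1234"}
	configs := map[configKey]string{
		{"backend-a", "id-1234", modeCompletion}: "completion config",
		{"backend-a", "id-1234", modeEmbedding}:  "embedding config",
	}

	// Deleting by the unresolved reference matches no key, so nothing is removed.
	delete(configs, configKey{"backend-a", "example/model", modeCompletion})
	fmt.Println("entries after delete by reference:", len(configs))

	// Deleting by the resolved ID, for both modes, clears both entries.
	n := unloadConfigs(configs, aliases, "backend-a", []string{"example/model"})
	fmt.Println("removed:", n, "remaining:", len(configs))
}
```

Because `delete` on a Go map is a no-op when the key is absent, a mismatched key produces no error, which is why a stale config can go unnoticed until it is reused.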
