Feature request: allow load/unload models on server #16487

@ngxson

I extracted this discussion from #13367, mainly to make it easier to plan the tasks around it.

Allow loading / unloading a model via API: in server.cpp, we can add a kind of "super" main() function that wraps around the current main(). The new main() will spawn an "interim" HTTP server that exposes an API to load a model. Of course, this functionality will be restricted to local deployments to avoid any security issues.
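To make the idea a bit more concrete, here is a minimal sketch of the control logic such a wrapper could own. Everything in it is hypothetical: `model_supervisor`, `model_instance`, `control_result`, and the HTTP-like status codes are illustrative names, not existing server.cpp code. The HTTP layer is left out entirely, and the loopback check assumes the "local only" rule is enforced by looking at the peer address.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for whatever the current main() sets up for one model.
struct model_instance {
    std::string path;
};

// Hypothetical result of a control-API call (HTTP-like status plus a message).
struct control_result {
    int         status;
    std::string message;
};

// Sketch of the state the "super" main() would own across load/unload calls.
class model_supervisor {
public:
    // Assumption: "local deployment only" means accepting loopback peers only.
    static bool is_local(const std::string & remote_addr) {
        return remote_addr == "127.0.0.1" || remote_addr == "::1";
    }

    // Load a model, replacing the currently loaded one if there is any.
    control_result load(const std::string & remote_addr, const std::string & path) {
        if (!is_local(remote_addr)) {
            return {403, "control API is local-only"};
        }
        if (path.empty()) {
            return {400, "missing model path"};
        }
        current_.reset(); // tear down the previous instance first
        current_ = std::make_unique<model_instance>(model_instance{path});
        return {200, "loaded " + path};
    }

    // Unload the current model; reports a conflict if nothing is loaded.
    control_result unload(const std::string & remote_addr) {
        if (!is_local(remote_addr)) {
            return {403, "control API is local-only"};
        }
        if (!current_) {
            return {409, "no model loaded"};
        }
        current_.reset();
        return {200, "unloaded"};
    }

    bool has_model() const { return current_ != nullptr; }

    // Path of the loaded model, or an empty string if none is loaded.
    std::string current_path() const {
        return current_ ? current_->path : std::string();
    }

private:
    std::unique_ptr<model_instance> current_;
};
```

In a real implementation, `model_instance` would presumably wrap the state that the current main() creates, and the interim HTTP server would just forward load/unload requests to an object like this one.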

This idea was demoed in #13400, but the implementation is still far from usable. It actually requires refactoring the server.

While alternative methods for hot-swapping models already exist, I think refactoring the server.cpp code can still benefit long-term development quite a lot. Therefore, this feature could be a suitable goal for the refactoring effort.

Labels: enhancement (New feature or request)
