Description
I'm extracting this discussion from #13367, mainly to make it easier to plan tasks around it.
Allow loading / unloading a model via API: in `server.cpp`, we can add a kind of "super" `main()` function that wraps around the current `main()`. The new main will spawn an "interim" HTTP server that exposes an API to load a model. Of course, this functionality will be restricted to local deployments to avoid any security issues.
This idea was demoed in #13400, but the implementation is still far from usable. It actually requires refactoring the server.
While alternative methods for hot-swapping models already exist, I think refactoring the `server.cpp` code can still benefit long-term development quite a lot. Therefore, this feature could be a suitable goal for the refactoring effort.