Allow viewing conversations even when llama server is down #16255
Conversation
webui: allow viewing conversations and sending messages even if llama-server is down

- Cached llama.cpp server properties in browser localStorage on startup, persisting successful fetches and reloading them when refresh attempts fail so the chat UI continues to render while the backend is unavailable.
- Cleared the stored server properties when resetting the store to prevent stale capability data after cache-backed operation.
- Kept the original error-splash behavior when no cached props exist so fresh installs still surface a clear failure state instead of rendering stale data.
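For illustration, a minimal TypeScript sketch of the flow described above. The `CACHE_KEY` name, the `ServerProps` shape, and the `loadServerProps` helper are hypothetical and do not mirror the actual webui store code:

```typescript
// Hypothetical shape of the /props payload; only illustrative fields shown
interface ServerProps {
	model_path?: string;
	default_generation_settings?: Record<string, unknown>;
}

const CACHE_KEY = 'llamacpp-server-props'; // hypothetical storage key

async function loadServerProps(): Promise<ServerProps | null> {
	try {
		const res = await fetch('/props');
		if (!res.ok) throw new Error(`HTTP ${res.status}`);
		const props = (await res.json()) as ServerProps;
		// Persist the successful fetch so later failures can fall back to it
		localStorage.setItem(CACHE_KEY, JSON.stringify(props));
		return props;
	} catch {
		// Backend unreachable or errored: reload the last known props, if any.
		// Returning null lets the caller keep the original error splash,
		// so fresh installs without a cache still see a clear failure state.
		const cached = localStorage.getItem(CACHE_KEY);
		return cached ? (JSON.parse(cached) as ServerProps) : null;
	}
}
```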
It looks like only 500 errors currently trigger the cached-props fallback, but not offline cases. This modification (74313a6) proposes also treating common network failures (connection refused, DNS issues, timeouts) and HTTP 5xx as "props unavailable" when cached props exist, as sketched below. That way the yellow "Server /props endpoint not available - using cached data" banner appears both for 500 errors and when the server is offline, while first-load installs without a cache still show the red error screen.
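A hypothetical classifier for this proposal (not the exact code from commit 74313a6). In browsers, `fetch()` rejects with a `TypeError` on network failures such as connection refused or DNS errors, and with an `AbortError` `DOMException` when a timeout controller aborts the request, so both can be grouped with HTTP 5xx:

```typescript
// Returns true when cached props should be used and the warning banner shown
function isPropsUnavailable(err: unknown, status?: number): boolean {
	if (status !== undefined && status >= 500) return true; // HTTP 5xx
	if (err instanceof TypeError) return true; // connection refused, DNS, ...
	if (err instanceof DOMException && err.name === 'AbortError') return true; // timeout
	return false;
}
```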
We currently cannot reactivate llama-server via llama-swap through the web interface, as the input area is grayed out. (This matters for self-hosting scenarios: geeks like me want to run this lightweight WebUI without exposing the llama-swap admin endpoint or /ui to the internet, to reduce attack surface. On top of that, llama-swap currently does not support relative URLs, which makes it harder to host safely behind a proxy.)
Treat connection failures (refused, DNS, timeout, fetch) the same way as server 5xx so the warning banner shows up when a cache is available, instead of falling back to a full error screen.
webui: Left the chat form enabled when a server warning is present so operators can keep sending messages, e.g. to restart the backend over llama-swap, even while cached /props data is in use
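A sketch of that gating change, with hypothetical state names: only a hard error (no cached props at all) disables the chat form, while the warning state keeps it enabled so a message can wake llama-swap.

```typescript
// Hypothetical server-state model; not the webui's actual store types
type ServerState =
	| { kind: 'ok'; props: Record<string, unknown> }
	| { kind: 'warning'; props: Record<string, unknown> } // banner shown, cache in use
	| { kind: 'error' }; // no cache: full error screen

function isChatFormEnabled(state: ServerState): boolean {
	// A warning must not block input: sending a message is what triggers
	// llama-swap to bring the backend back up
	return state.kind !== 'error';
}
```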
Hey @ServeurpersoCom, these are all very valid arguments and sensible code changes! Thank you for contributing! I've cherry-picked your proposals and included them in this PR 🙂
Allow viewing conversations even when llama server is down (#16255)

* webui: allow viewing conversations and sending messages even if llama-server is down

  - Cached llama.cpp server properties in browser localStorage on startup, persisting successful fetches and reloading them when refresh attempts fail so the chat UI continues to render while the backend is unavailable.
  - Cleared the stored server properties when resetting the store to prevent stale capability data after cache-backed operation.
  - Kept the original error-splash behavior when no cached props exist so fresh installs still surface a clear failure state instead of rendering stale data.

* feat: Add UI for `props` endpoint unavailable + cleanup logic

* webui: extend cached props fallback to offline errors

  Treat connection failures (refused, DNS, timeout, fetch) the same way as server 5xx so the warning banner shows up when a cache is available, instead of falling back to a full error screen.

* webui: Left the chat form enabled when a server warning is present so operators can keep sending messages, e.g. to restart the backend over llama-swap, even while cached /props data is in use

* chore: update webui build output

---------

Co-authored-by: Pascal <[email protected]>
Closes #16120
Implements caching the `/props` response and gracefully failing when the `llama-server` is down. Cherry-picked diff from @ServeurpersoCom's proposal[^1], and added UI for the "offline" state.
Footnotes
[^1]: https://github.com/ggml-org/llama.cpp/compare/master...ServeurpersoCom:llama.cpp:webui-cache-offline-chat