
Conversation

@allozaur
Collaborator

Closes #16120

Implements caching of the /props response and graceful failure when llama-server is down.

Cherry-picked the diff from @ServeurpersoCom's proposal¹ and added UI for the "offline" state.

Screenshot 2025-09-25 at 16:39:28

Footnotes

  1. https://github.com/ggml-org/llama.cpp/compare/master...ServeurpersoCom:llama.cpp:webui-cache-offline-chat

ServeurpersoCom and others added 2 commits September 25, 2025 15:53
webui: allow viewing conversations and sending messages even if llama-server is down

- Cached llama.cpp server properties in browser localStorage on startup, persisting successful fetches and reloading them when refresh attempts fail so the chat UI continues to render while the backend is unavailable.
- Cleared the stored server properties when resetting the store to prevent stale capability data after cache-backed operation.
- Kept the original error-splash behavior when no cached props exist so fresh installs still surface a clear failure state instead of rendering stale data.
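
For illustration, a minimal TypeScript sketch of that flow — the cache key, the ServerProps shape, and the helper names here are assumptions, not the webui's actual identifiers:

```ts
// Illustrative cache key and props shape (hypothetical names).
const PROPS_CACHE_KEY = 'llamacpp-server-props';

interface ServerProps {
  model_path?: string;
  default_generation_settings?: Record<string, unknown>;
  [key: string]: unknown;
}

// On startup: fetch /props, persist successes, and fall back to the
// cached copy when the fetch fails so the chat UI can still render.
async function loadServerProps(baseUrl = ''): Promise<ServerProps | null> {
  try {
    const res = await fetch(`${baseUrl}/props`);
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    const props = (await res.json()) as ServerProps;
    localStorage.setItem(PROPS_CACHE_KEY, JSON.stringify(props));
    return props;
  } catch {
    const cached = localStorage.getItem(PROPS_CACHE_KEY);
    // null → no cache for a fresh install, so the error splash still shows.
    return cached ? (JSON.parse(cached) as ServerProps) : null;
  }
}

// On store reset: drop the cached props so stale capability data does
// not survive a cache-backed session.
function clearServerPropsCache(): void {
  localStorage.removeItem(PROPS_CACHE_KEY);
}
```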
@ServeurpersoCom
Collaborator

It looks like only 500 errors currently trigger the cached-props fallback, not offline cases. Commit 74313a6 proposes treating common network failures (connection refused, DNS issues, timeouts) as well as HTTP 5xx as "props unavailable" when cached props exist.

That way the yellow "Server /props endpoint not available - using cached data" banner appears both for 500 errors and when the server is offline, while first-load installs without cache still show the red error screen.
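
A rough sketch of that classification — the PropsState union and loadProps helper are assumptions for illustration, not the actual webui code. Network-level failures reject the fetch() promise, so they can funnel into the same cached-fallback state as 5xx responses:

```ts
// Hypothetical UI states for the /props load.
type PropsState =
  | { kind: 'ok'; props: unknown }
  | { kind: 'cached'; props: unknown } // yellow banner: cached data in use
  | { kind: 'error' };                 // red screen: no cache to fall back on

async function loadProps(cached: unknown | null): Promise<PropsState> {
  try {
    const res = await fetch('/props');
    if (res.ok) return { kind: 'ok', props: await res.json() };
    // HTTP 5xx with a cache available → warning banner, not error screen.
    if (res.status >= 500 && cached !== null) return { kind: 'cached', props: cached };
  } catch {
    // Connection refused, DNS failures, and aborted timeouts reject the
    // fetch() promise and land here; treat them the same as 5xx.
    if (cached !== null) return { kind: 'cached', props: cached };
  }
  return { kind: 'error' };
}
```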

@ServeurpersoCom
Collaborator

We currently cannot reactivate llama-server via llama-swap through the web interface, as the input area is grayed out.
This means that if llama-swap fails to load llama-server, or if it crashes, we end up deadlocked.
I suggest not blocking the ability to send messages: 9509b45

(This matters for self-hosting scenarios: geeks like me want to run this lightweight WebUI without exposing the llama-swap admin endpoint or /ui to the internet, to keep the attack surface small. On top of that, llama-swap currently does not support relative URLs, which makes it harder to host safely behind a proxy.)
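
Reusing the PropsState type from the earlier sketch, the suggested gating change amounts to roughly this (isChatInputDisabled is a hypothetical name):

```ts
// Only the hard-error state (no props, no cache) blocks input; the
// cached/warning state leaves the form usable, so a message can still
// reach llama-swap and wake the backend back up.
function isChatInputDisabled(state: PropsState): boolean {
  return state.kind === 'error';
}
```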

ServeurpersoCom and others added 3 commits September 26, 2025 03:08
Treat connection failures (refused, DNS, timeout, fetch) the same way as
server 5xx so the warning banner shows up when cache is available, instead
of falling back to a full error screen.
webui: Left the chat form enabled when a server warning is present so operators can keep sending messages

e.g., to restart the backend over llama-swap, even while cached /props data is in use
@allozaur
Collaborator Author

Hey, @ServeurpersoCom, these are all very valid arguments and sensible code changes! Thank you for contributing! I've cherry-picked your proposals and included them in this PR 🙂

@allozaur allozaur marked this pull request as ready for review September 26, 2025 01:11
…ng-conversations-even-when-llama-server-is-down
@allozaur allozaur merged commit 1a18927 into ggml-org:master Sep 26, 2025
14 checks passed
struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
webui: allow viewing conversations and sending messages even if llama-server is down (ggml-org#16255)

* webui: allow viewing conversations and sending messages even if llama-server is down

- Cached llama.cpp server properties in browser localStorage on startup, persisting successful fetches and reloading them when refresh attempts fail so the chat UI continues to render while the backend is unavailable.
- Cleared the stored server properties when resetting the store to prevent stale capability data after cache-backed operation.
- Kept the original error-splash behavior when no cached props exist so fresh installs still surface a clear failure state instead of rendering stale data.

* feat: Add UI for `props` endpoint unavailable + cleanup logic

* webui: extend cached props fallback to offline errors

Treat connection failures (refused, DNS, timeout, fetch) the same way as
server 5xx so the warning banner shows up when cache is available, instead
of falling back to a full error screen.

* webui: Left the chat form enabled when a server warning is present so operators can keep sending messages

e.g., to restart the backend over llama-swap, even while cached /props data is in use

* chore: update webui build output

---------

Co-authored-by: Pascal <[email protected]>
@allozaur allozaur deleted the 16120-allow-viewing-conversations-even-when-llama-server-is-down branch September 29, 2025 16:55
yael-works pushed a commit to yael-works/llama.cpp that referenced this pull request Oct 15, 2025
pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Oct 23, 2025