webui: auto-refresh /props on inference start to resync model metadata #16784
Conversation
Force-pushed from 2e9e0eb to 9397c4d
The final fix is done: now that the /props refresh is unified, legacy mode (without the model selector) behaves correctly.
- LEGACY MODE (modelSelectorEnabled = false)
- MODEL SELECTOR MODE (modelSelectorEnabled = true)

Summary:
Force-pushed from 98b76cc to 8da3c0e
allozaur
left a comment
Yes, @ServeurpersoCom, that is much better than the initial idea for this PR 😄
Good thing the merge stashes everything; those intermediate commits were an emotional rollercoaster 😆 Glad we made it out alive 😆

🤣
Force-pushed from ab8bbf0 to 1ee8d6b
Rebased to retry CI.
Force-pushed from 1ee8d6b to b9747cc
Rebased + static build
- Add no-cache headers to /props and /slots
- Throttle slot checks to 30s
- Prevent concurrent fetches with a promise guard
- Trigger refresh from chat streaming for legacy and ModelSelector modes
- Show a dynamic serverWarning when using cached data
Updated assistant message bubbles to show each message's stored model when available, falling back to the current server model only when the per-message value is missing.

When the model selector is disabled, the webui now fetches /props and prioritizes that model name over chunk metadata, then persists it with the streamed message so legacy mode properly reflects the backend configuration.
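The fallback order described above can be sketched in a few lines. This is a minimal TypeScript sketch under assumed names: the `ChatMessage` shape and both function names are illustrative, not the webui's actual types.

```typescript
// Sketch of the model-name fallback chain; names are assumptions.

interface ChatMessage {
  content: string;
  model?: string; // model name persisted with the streamed message, if any
}

// Assistant bubble: prefer the model stored on the message itself, and
// fall back to the server's currently loaded model only when the
// per-message value is missing.
function displayModel(message: ChatMessage, serverModel: string): string {
  return message.model ?? serverModel;
}

// Legacy mode (model selector disabled): the name from /props takes
// priority over the per-chunk metadata in the streamed response.
function resolveStreamedModel(
  propsModel: string | undefined,
  chunkModel: string | undefined
): string | undefined {
  return propsModel ?? chunkModel;
}
```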
Cmdline used on legacy (Raspberry Pi 5):
Fixes #16771
EDIT: I've recorded a video that specifically targets the original issue.
Testing video, Raspberry Pi 5 + master branch + this PR:
CmdLineSwap-RaspberryPi5.mp4
And another one to show there's no regression when the model selector is enabled,
also demonstrating the multimodal function updates:
NonReg-FullSetup.mp4