Instant reply?
#1800
Replies: 1 comment
-
Is performance your concern, or just aesthetics? Assuming we do things the same way llama.cpp's 'main' example does, there shouldn't be a difference in performance. I know TGWUI and koboldcpp seem to be slower in streaming mode, but I think that's because of client UI lock-step or lag in the former, and possibly redundant llama.cpp operations in the latter. If it's for aesthetics, open a feature request; discussions are not the right place for this.
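To illustrate the point about performance, here is a minimal sketch of the two display modes. It does not use this project's API: `fake_token_stream`, `reply_streaming`, `reply_buffered`, and the `emit` callback are all hypothetical names, and the token generator is a stand-in for a real model. Under that assumption, both modes consume the same token stream, so buffering only changes when the text is shown, not how long generation takes.

```python
import time
from typing import Callable, Iterable, Iterator


def fake_token_stream(tokens: Iterable[str], delay_s: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a model that yields one token at a time."""
    for token in tokens:
        time.sleep(delay_s)  # pretend each token costs some generation time
        yield token


def reply_streaming(stream: Iterable[str], emit: Callable[[str], None]) -> str:
    """Show each token as soon as it is generated."""
    parts = []
    for token in stream:
        emit(token)
        parts.append(token)
    return "".join(parts)


def reply_buffered(stream: Iterable[str], emit: Callable[[str], None]) -> str:
    """Collect every token first, then show the whole reply in one call."""
    text = "".join(stream)
    emit(text)
    return text


if __name__ == "__main__":
    tokens = ["Hello", ",", " world", "!"]
    # Streaming: text appears piece by piece.
    reply_streaming(fake_token_stream(tokens, 0.01), lambda t: print(t, end="", flush=True))
    print()
    # Buffered: nothing appears until the last token has been generated.
    reply_buffered(fake_token_stream(tokens, 0.01), print)
```

In both functions the loop over the stream does the same amount of work; the buffered version simply delays the single `emit` call until the stream is exhausted, which is where a client could show a loading indicator.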
-
Is it possible to turn off the "typing like a human" behavior and post the whole answer all at once?
Right now it gives me the answer bit by bit and it's very slow. I assume that's my hardware, but instead of doing it that way, would there be an option where it buffers the whole answer, shows a processing or loading wheel while it does that, and then posts the whole thing at once when it's done?