Closed
Labels
bug-unconfirmed, medium severity (used to report medium severity bugs in llama.cpp, e.g. malfunctioning features that are still usable)
Description
What happened?
When using /completion with stream: true, the last two JSON chunks arrive together in a single read in Firefox. Chrome seems to handle it fine, so it might be a Firefox bug.
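To illustrate the symptom, here is a minimal client-side sketch of a parser that tolerates several events arriving in one read. It assumes each streamed event is a `data: <json>` line followed by a blank line (`\n\n`). The function name `split_events` and the sample field values are hypothetical, chosen only for illustration.

```python
import json


def split_events(buffer: str, incoming: str):
    """Append newly received text to the buffer and extract complete events.

    Returns (events, leftover), where events is a list of decoded JSON
    payloads and leftover is the unterminated tail to keep for the next read.
    Assumption: events are separated by a blank line ("\n\n").
    """
    buffer += incoming
    events = []
    while "\n\n" in buffer:
        raw, buffer = buffer.split("\n\n", 1)
        for line in raw.split("\n"):
            if line.startswith("data: "):
                events.append(json.loads(line[len("data: "):]))
    return events, buffer


# Two complete events delivered in a single read, plus a partial third one.
incoming = 'data: {"content": "Hi"}\n\ndata: {"content": "!", "stop": true}\n\ndata: {"con'
events, rest = split_events("", incoming)
```

Splitting on the delimiter instead of assuming one event per read gives the same result whether the browser delivers the events separately or together.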
Looking further into this, it seems that HTTP Transfer-Encoding: chunked requires each chunk to be terminated with \r\n, but here \n\n is used instead.
This does not seem to be a Windows-only requirement; it is listed as part of the HTTP specification: HTTP Chunked Transfer Coding
More information, including an example chunked response: Transfer-Encoding Directives
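For reference, the chunk framing described in the HTTP specification can be sketched as follows: each chunk is its size in hexadecimal, then \r\n, then the payload bytes, then \r\n, and the body ends with a zero-size chunk. This is an illustrative sketch, not llama-server's code; the helper names `encode_chunk` and `decode_chunked` are hypothetical, and trailers are not handled.

```python
LAST_CHUNK = b"0\r\n\r\n"  # zero-size chunk followed by the final CRLF


def encode_chunk(payload: bytes) -> bytes:
    """Frame one payload as an HTTP/1.1 chunk: <hex size>CRLF<payload>CRLF."""
    return f"{len(payload):X}".encode("ascii") + b"\r\n" + payload + b"\r\n"


def decode_chunked(body: bytes) -> list:
    """Split a chunked body into its payloads, checking the CRLF framing."""
    chunks = []
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)            # end of the size line
        size = int(body[pos:eol].split(b";")[0], 16)  # ignore chunk extensions
        start = eol + 2
        if size == 0:                              # last chunk reached
            return chunks
        end = start + size
        if body[end:end + 2] != b"\r\n":
            raise ValueError("chunk payload is not followed by CRLF")
        chunks.append(body[start:end])
        pos = end + 2


# The payload may contain any bytes, including "\n\n";
# the CRLF requirement applies to the framing around it.
event = b'data: {"content": "Hi"}\n\n'
wire = encode_chunk(event) + LAST_CHUNK
```

In this sketch the \r\n terminators belong to the transfer framing, so whether the \n\n observed in the response sits in the framing or inside the chunk payload is best checked against the raw bytes on the wire.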
Name and Version
llama-server.exe
version: 3761 (6262d13)
built with MSVC 19.29.30154.0 for x64
What operating system are you seeing the problem on?
Windows
Relevant log output
No response