Contact Details
What happened?
Use the following script to send a completion request to a model served by llamafile:
import requests

k = 30000
s = "(" * k + "a"
resp = requests.post(
    "http://127.0.0.1:8080/completion",
    json={
        "prompt": "1",
        "n_predict": 10,
        "grammar": f"root ::= {s}",
    },
)
print(resp.json())

The server then crashes with Segmentation fault (core dumped).
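The crash pattern is consistent with unbounded recursion while parsing the deeply nested grammar: each "(" opens one more nesting level, and a recursive-descent parser consumes one native stack frame per level, so 30000 levels can overflow the stack. This is a hypothetical sketch of that failure mode, not llamafile's actual parser code; Python raises RecursionError where equivalent C++ recursion would overflow the native stack and segfault:

```python
def parse_group(s: str, i: int = 0) -> int:
    # Recursive-descent sketch: one stack frame per "(" nesting level.
    if i < len(s) and s[i] == "(":
        return parse_group(s, i + 1)
    return i

try:
    # 30000 levels far exceeds Python's default recursion limit (~1000).
    parse_group("(" * 30000 + "a")
except RecursionError:
    print("recursion limit hit")
```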
Version
llamafile v0.9.3
What operating system are you seeing the problem on?
Linux
Relevant log output
./tinyllama-15M.Q8_0.llamafile
██╗ ██╗ █████╗ ███╗ ███╗ █████╗ ███████╗██╗██╗ ███████╗
██║ ██║ ██╔══██╗████╗ ████║██╔══██╗██╔════╝██║██║ ██╔════╝
██║ ██║ ███████║██╔████╔██║███████║█████╗ ██║██║ █████╗
██║ ██║ ██╔══██║██║╚██╔╝██║██╔══██║██╔══╝ ██║██║ ██╔══╝
███████╗███████╗██║ ██║██║ ╚═╝ ██║██║ ██║██║ ██║███████╗███████╗
╚══════╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝╚══════╝
software: llamafile 0.9.3
model: tinyllama-15M.Q8_0.gguf
mode: RAW TEXT COMPLETION (base model)
compute: AMD EPYC 9654 96-Core Processor (znver4)
server: http://127.0.0.1:8080/
>>> Segmentation fault (core dumped)