Allow adjustable timeouts on Ollama and OpenAI-compatible endpoints #1375
devlux76 started this conversation in 2. Design improvements
I have a very nice model I'm trying to use. Unfortunately, with an 8k default initial prompt size, it takes a while to reach the first token, and the request times out before it gets there. This is making me cry...

I can't do anything about how slow Ollama and llama.cpp are with this model, but hopefully, with enough tears shed here, the Kilocode gods will bless me with an adjustable timeout. Thanks so much!
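Concretely, I'm imagining something like the sketch below: one timeout value read from settings and passed along with the request, instead of a fixed one. This is just a rough TypeScript illustration, not Kilocode's actual code; the option name `requestTimeoutMs`, the model name, and the localhost URL are made-up placeholders.

```ts
// Minimal sketch, assuming a fetch-based client. `requestTimeoutMs`, the model
// name, and the base URL are placeholders, not Kilocode's real configuration.

interface ChatRequestOptions {
  baseUrl: string;           // e.g. http://localhost:11434/v1 (Ollama's OpenAI-compatible API)
  apiKey?: string;
  requestTimeoutMs: number;  // user-adjustable instead of a hard-coded default
}

async function chatCompletion(prompt: string, opts: ChatRequestOptions): Promise<string> {
  // AbortSignal.timeout() aborts the request if it hasn't finished within the
  // configured window, which is exactly what a slow time-to-first-token hits.
  const response = await fetch(`${opts.baseUrl}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      ...(opts.apiKey ? { Authorization: `Bearer ${opts.apiKey}` } : {}),
    },
    body: JSON.stringify({
      model: "my-local-model",
      messages: [{ role: "user", content: prompt }],
    }),
    signal: AbortSignal.timeout(opts.requestTimeoutMs),
  });

  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}

// With a slow local model that needs minutes of prompt processing before the
// first token, the limit could be raised well past a typical 30-60 s default:
// await chatCompletion("Hello", {
//   baseUrl: "http://localhost:11434/v1",
//   requestTimeoutMs: 300_000, // 5 minutes
// });
```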