Kilo Code appears to be stuck in a loop, attempting the same action (apply_diff) repeatedly. #1423
Replies: 3 comments
-
I've noticed this happens at the same spot, context-wise, for every single model. For the GPT models it seems to happen at 16.3k, while for Gemini it seems to happen around 88k. The key here is how consistent it is. I was running my own tests with llama.cpp and also Ollama so I could watch what's being sent to the server. Just saying hello takes over 10k context tokens (imagine using 30k words to say hello). Local models have a default context length of 2048 or 4096. Seeing the pattern here? The fact is we're getting close to an edge where one block of memory gives way to the next. You either have to blow through that edge by generating up to 4k new tokens, or reduce the context to stay under it. You can't just trip the light fantastic on the edge of a rope. Here's what I'm doing, and it seems to work because it generates enough tokens to get us off the edge.
Doing this gets the model a page or two past that barrier point, and the problem won't come back until we get close to the next edge, which seems to be around 32k, 64k, etc. I'll bet the 88k on Gemini is a similar issue.
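To make the "edge" idea concrete, here is a minimal sketch (not Kilo Code's actual logic; the boundary list, margin, and function names are all assumptions for illustration) of how you might detect that a prompt is sitting right on a context boundary and decide whether to trim history or deliberately generate enough new tokens to land past it:

```python
# Illustrative sketch only: estimate prompt size, check proximity to a
# power-of-two context boundary, and plan how many new tokens to generate.

BOUNDARIES = [16_384, 32_768, 65_536, 131_072]  # typical context "edges"
SAFETY_MARGIN = 512                              # hypothetical buffer in tokens

def estimate_tokens(text: str) -> int:
    # Rough heuristic: about 4 characters per token for English text.
    return max(1, len(text) // 4)

def plan_request(prompt: str, max_new_tokens: int = 4096) -> dict:
    used = estimate_tokens(prompt)
    for edge in BOUNDARIES:
        if edge - SAFETY_MARGIN <= used < edge:
            # Sitting right on an edge: either trim the prompt below it,
            # or generate enough new tokens to move well past it.
            overshoot = (edge - used) + SAFETY_MARGIN
            return {
                "action": "generate_past_edge",
                "min_new_tokens": overshoot,
                "max_new_tokens": max(max_new_tokens, overshoot),
            }
    return {"action": "normal", "max_new_tokens": max_new_tokens}
```

Under these assumptions, a prompt that lands within a few hundred tokens of 16,384 would trigger the "generate past the edge" branch, which mirrors the behavior described above where getting a page or two of output past the barrier makes the loop stop until the next boundary.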
-
This PR should fix this with Morph's fast apply model! #1428
-
This is what the Usage Overview shows. I think I should configure the request limit to 20-30? What is the impact if we limit it?
-
Kilo Code is having trouble...
Kilo Code appears to be stuck in a loop, attempting the same action (apply_diff) repeatedly. This might indicate a problem with its current strategy. Consider rephrasing the task, providing more specific instructions, or guiding it towards a different approach.