Replies: 1 comment
I also get this error (ollama 0.4.2) when using llama2 (with 128k token ctx).
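For reference (not part of the original reply), a context window that large is normally requested from a local Ollama server through the `num_ctx` option; below is a minimal sketch against Ollama's REST API, where the model tag, URL, and the 128k value are assumptions for illustration only.

```python
# Hypothetical illustration: requesting a large context window from a local
# Ollama server via the /api/chat endpoint's num_ctx option.
import requests

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama2",                     # assumed model tag
        "messages": [{"role": "user", "content": "Hello"}],
        "options": {"num_ctx": 131072},        # 128k-token context window
        "stream": False,                        # return one JSON object
    },
    timeout=600,
)
print(response.json()["message"]["content"])
```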
Using Llama CPP locally, an error occurs after many questions:
The error always occurs when the generated responses exceed approximately 3,500 tokens (roughly 16,000 characters).
The context is probably too big. How can I adjust this? I don't want to enlarge the context, but rather keep context only for the last 1 or 2 questions.
The error can be reproduced simply by doing something like this:
PC used: Apple M1 Max, 32 GB
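On the trimming itself (not from the original post), here is a minimal sketch assuming llama-cpp-python and a plain chat loop: only the last one or two question/answer exchanges are sent with each request, so the accumulated history never outgrows the context window. The model path, the KEEP_EXCHANGES constant, and the ask() helper are illustrative assumptions.

```python
# Minimal sketch (assumptions: llama-cpp-python, a local GGUF model path).
# Only the most recent question/answer exchanges are sent to the model.
from llama_cpp import Llama

llm = Llama(model_path="./llama-2-7b-chat.Q4_K_M.gguf", n_ctx=4096)  # assumed path

history = []          # full chat history as {"role": ..., "content": ...} dicts
KEEP_EXCHANGES = 2    # send at most the last 2 question/answer pairs

def ask(question: str) -> str:
    history.append({"role": "user", "content": question})
    # Trim to at most KEEP_EXCHANGES user/assistant pairs (2 messages each).
    trimmed = history[-(KEEP_EXCHANGES * 2):]
    out = llm.create_chat_completion(messages=trimmed, max_tokens=512)
    answer = out["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    return answer

# Usage: repeated questions no longer grow the prompt without bound.
print(ask("What is the capital of France?"))
print(ask("And of Italy?"))
```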