Description
The error occurs when the context fills up. I tried the following `InferenceParams` settings:
- `TokensKeep`: -1, 0, and a positive value
Code:
```csharp
...
InteractiveExecutor executor = new(context, logger);
...
await foreach (string text in session.ChatAsync(
    new ChatHistory.Message(AuthorRole.User, query),
    false, inferenceParams, cancellationToken))
{
}
```
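For context, here is a hedged sketch of what the elided setup (the `...` lines) around that snippet might look like. The model path, context size, and GPU layer count are placeholders I chose, not values from this report; the type names (`ModelParams`, `LLamaWeights`, `ChatSession`) are from LLamaSharp's public API.

```csharp
// Hypothetical reconstruction of the elided setup; all parameter values are placeholders.
using LLama;
using LLama.Common;

var parameters = new ModelParams("model.gguf") // placeholder path
{
    ContextSize = 4096,   // placeholder; the crash occurs once this fills
    GpuLayerCount = 99    // placeholder for a fully GPU-offloaded run
};

using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);
var executor = new InteractiveExecutor(context);
var session = new ChatSession(executor);

var inferenceParams = new InferenceParams
{
    TokensKeep = 0 // the report says -1, 0, and a positive value were all tried
};

await foreach (string text in session.ChatAsync(
    new ChatHistory.Message(AuthorRole.User, "query"),
    false, inferenceParams, CancellationToken.None))
{
    Console.Write(text);
}
```

The assertion fires inside llama.cpp's KV-cache code when the context is shifted, so any setup that reaches the context limit during `ChatAsync` should reproduce it regardless of these placeholder values.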
Critical error:
```
D:\a\LLamaSharp\LLamaSharp\src\llama-kv-cache.cpp:398: GGML_ASSERT(hparams.n_pos_per_embd() == 1 && "seq_add() is only supported for n_pos_per_embd() == 1") failed
```
Environment & Configuration
- Operating system: Windows 11
- .NET runtime version: .NET 10.0.3
- LLamaSharp version: 0.26.0
- CUDA version (if you are using cuda backend): 12.9
- CPU & GPU device: GPU RTX 5090