I can't guarantee it will work, but I think you just have to call llama_kv_cache_tokens_rm(ctx, -1, -1); before every new input, so that each input starts from an empty KV cache.
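
To make the idea concrete, here is a toy sketch of what "clear the cache before every new input" amounts to. It does not use llama.cpp at all: `toy_kv_cache`, `toy_kv_cache_tokens_rm`, and the other `toy_*` names are hypothetical stand-ins, and the handling of negative bounds (treated as an open-ended range, so `-1, -1` clears everything) is my assumption about what the real call is meant to do.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_N_CELLS 8

/* Hypothetical stand-in for a KV cache: each cell stores a token
   position, or -1 when the cell is empty. Not llama.cpp's structure. */
typedef struct {
    int32_t pos[TOY_N_CELLS];
} toy_kv_cache;

/* Mark every cell as empty. */
static void toy_kv_cache_init(toy_kv_cache *cache) {
    for (int32_t i = 0; i < TOY_N_CELLS; ++i) {
        cache->pos[i] = -1;
    }
}

/* Toy range removal over cells [c0, c1). Assumption: a negative bound
   means "open-ended", so (-1, -1) empties the whole cache. */
static void toy_kv_cache_tokens_rm(toy_kv_cache *cache, int32_t c0, int32_t c1) {
    if (c0 < 0) c0 = 0;
    if (c1 < 0 || c1 > TOY_N_CELLS) c1 = TOY_N_CELLS;
    for (int32_t i = c0; i < c1; ++i) {
        cache->pos[i] = -1;
    }
}

/* Count the cells that currently hold a token. */
static int toy_kv_cache_used(const toy_kv_cache *cache) {
    int used = 0;
    for (int32_t i = 0; i < TOY_N_CELLS; ++i) {
        if (cache->pos[i] >= 0) used++;
    }
    return used;
}

/* Simulate one "new input": clear everything first, then fill one cell
   per token. Returns the number of occupied cells afterwards. */
static int toy_process_input(toy_kv_cache *cache, int32_t n_tokens) {
    assert(n_tokens >= 0 && n_tokens <= TOY_N_CELLS);
    toy_kv_cache_tokens_rm(cache, -1, -1);
    for (int32_t i = 0; i < n_tokens; ++i) {
        cache->pos[i] = i;
    }
    return toy_kv_cache_used(cache);
}
```

In this toy model, calling `toy_process_input` with 5 tokens and then with 3 tokens leaves only the 3 new tokens in the cache, because the clear at the start of each call drops whatever the previous input left behind. Without that clear, the cells from the earlier input would still be occupied.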

Answer selected by trzy