-
Context shifting is handled automatically by LlamaChatSession. For example, if you have a context size of 100K tokens and all of it gets filled with chat history, the oldest parts of the history are shifted out automatically to make room for new tokens. I have tests for this implementation that seem to pass, but just to make sure, I've tested it again manually and it appears to work as expected.
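For reference, this is roughly the usage in question — a minimal sketch assuming node-llama-cpp's v3 API, with a placeholder model path:

```typescript
// Minimal sketch, assuming node-llama-cpp's v3 API; the model path is a placeholder.
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});

// Fixed-size context; once chat history fills it, the oldest tokens
// should be shifted out automatically to make room for new ones.
const context = await model.createContext({contextSize: 4096});
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Prompting repeatedly will eventually exceed contextSize; context
// shifting is expected to handle the overflow without manual truncation.
const answer = await session.prompt("Summarize our conversation so far.");
console.log(answer);
```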
-
Maybe there's an issue with small context sizes, then. If you use a context of 1024 tokens and try to pass in around 1200 tokens' worth of context, it hangs for, as I mentioned, an indefinite amount of time; I've waited as long as 30 minutes to see if it was just slow. I also tested different batch sizes to see if that was the issue, but didn't have any luck there. Is there some reason for there to be a minimum context size for this to work?
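For anyone trying to reproduce this, here's a sketch of the failing scenario as I understand it; the model path and filler prompt are placeholders, not from my actual setup:

```typescript
// Hypothetical repro sketch, assuming node-llama-cpp's v3 API; the model
// path and filler prompt are placeholders, not from my actual setup.
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});

// Deliberately small context so a single prompt overflows it.
const context = await model.createContext({
    contextSize: 1024,
    batchSize: 512 // varying this made no difference for me
});
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Roughly 1200 tokens of filler; on my setup this call hangs indefinitely
// instead of triggering a context shift.
const longPrompt = "The quick brown fox jumps over the lazy dog. ".repeat(130);
const answer = await session.prompt(longPrompt);
console.log(answer);
```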
-
From this page in the documentation I get the impression that chat history is automatically truncated to fit the contextSize when using LlamaChat or LlamaChatSession. However, when I try to add more messages than fit in the current contextSize, it seems to hang indefinitely. I'm not sure if that's a bug or if the user is meant to handle truncating chat messages so they don't exceed the contextSize.
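In case it's relevant, this is the kind of manual truncation I'd expect to have to do if it isn't automatic — a rough sketch assuming LlamaChatSession exposes getChatHistory()/setChatHistory() and a ChatHistoryItem type (if those aren't the actual APIs, treat this as pseudocode):

```typescript
// Rough sketch of manual truncation, assuming LlamaChatSession exposes
// getChatHistory()/setChatHistory() and a ChatHistoryItem type; if those
// aren't the actual APIs, treat this as pseudocode.
import {getLlama, LlamaChatSession, type ChatHistoryItem} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"});
const context = await model.createContext({contextSize: 1024});
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Drop the oldest non-system messages, keeping the system prompt intact;
// maxMessages is an arbitrary illustrative limit, not a library parameter.
function truncateHistory(history: ChatHistoryItem[], maxMessages: number): ChatHistoryItem[] {
    const system = history.filter((item) => item.type === "system");
    const rest = history.filter((item) => item.type !== "system");
    return [...system, ...rest.slice(-maxMessages)];
}

session.setChatHistory(truncateHistory(session.getChatHistory(), 8));
```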