Replies: 1 comment
-
🤖 Hello,

The "context_length_exceeded" error occurs when the total number of tokens sent to the model exceeds its maximum context window. In your case you've set "max_tokens_limit" to 8000, but the conversation history plus the retrieved documents and the prompt can still push the total past the model's 8192-token window, which is when you see this error.

To prevent it, you can limit how many dialogue turns are included in the chat history, keeping only the most recent turns that fit within the token budget. In the LangChain framework, the ConversationTokenBufferMemory class provides a method called "save_context" that prunes the conversation history if it exceeds the max token limit, removing the oldest messages until the total token count is back under the limit. Here's how it works:

```python
def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
    """Save context from this conversation to buffer. Pruned."""
    super().save_context(inputs, outputs)
    # Prune buffer if it exceeds max token limit
    buffer = self.chat_memory.messages
    curr_buffer_length = self.llm.get_num_tokens_from_messages(buffer)
    if curr_buffer_length > self.max_token_limit:
        pruned_memory = []
        while curr_buffer_length > self.max_token_limit:
            # Drop the oldest message and re-count until we are under the limit
            pruned_memory.append(buffer.pop(0))
            curr_buffer_length = self.llm.get_num_tokens_from_messages(buffer)
```

This method is called automatically after each dialogue turn, so you don't need to call it manually. If you're still encountering the "context_length_exceeded" error, you may need to reduce the max token limit or prune the conversation history more aggressively.

Regarding the "reduce_k_below_max_tokens" parameter: it is a boolean flag that determines whether the number of results returned from the store should be reduced based on a token limit. If "reduce_k_below_max_tokens" is set to True and the "combine_documents_chain" is an instance of StuffDocumentsChain, the "_reduce_tokens_below_limit" method will count the tokens in the documents returned by the retriever and, if the total exceeds "max_tokens_limit", drop documents until the total is below the limit.

I hope this helps! If you have any other questions, feel free to ask.

Best,
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
-
Hello, I have a problem: after a few messages with my chat I get an error:
error_code=context_length_exceeded error_message="This model's maximum context length is 8192 tokens. However, your messages resulted in 9066 tokens. Please reduce the length of the messages." error_param=messages error_type=invalid_request_error message='OpenAI API error received' stream_error=False
My main Chain looks like this:
But reduce_k_below_max_tokens=True is not helping, and I've also tried setting chain.max_tokens_limit = 8000 later in the code, but that doesn't work either. I'm using Chainlit to build the chat.
How can I prevent my chat from getting this error?
I know the prompt can be too long, and that's one problem. But why, after only 4-5 questions and answers, does the model run into this context problem? Can I clear the oldest messages from the context? Not from the prompt, but from the conversation I'm currently having?