What is summarization / is it working? #2183
Replies: 2 comments 2 replies
-
We are using an adaptation on the "ConversationSummaryBufferMemory" strategy to summarize messages. To learn more about this, see this article: https://www.pinecone.io/learn/series/langchain/langchain-conversational-memory/ To summarize (lol), the summarization is triggered when the following conditions are met:
This worked well in the age of models with 4-8k context, when this was first implemented, operating within the "efficient" realm as shown in the article. However, this needs to be revisited soon as we are now in the age of ever-increasing context windows (gpt-4-turbo with 128k and anthropic 200k+). That means that we need to get to around 60-100k tokens for summarization to kick in. While this may alleviate costs from using the full context, it's sub-optimal. I would also like to add an option for the user, through the config file, to decide what the summary context window should be, first on an endpoint-level then even on a model level. |
Beta Was this translation helpful? Give feedback.
-
Claude Code has a feature where you can call It's a bit rough (when it's running rampant through files you need to run it often or you'll be paying dollars a minute) but it could be a useful feature here. For example, you could spawn a thread based on the current chat up to this point, but the starting message is a summary from the assistant of everything that occurred up to this point, so the user can continue with their next request, etc. Probably in practice you'd want to be able to customise the summarise prompt per agent (or let the user provide instructions at the point it's happening) so it knows what to focus on (e.g. how long should the summary be? if we're summarising a second time, how much should it care about the content of the last summary versus what's happened since?) but having compact/summarise as a thread option might be a better UX than trying to mysteriously summarise in the background? |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
I am using the Azure OpenAI models and have enabled summarization on the endpoint-level. I am not sure, however, where exactly something is being summarized, now that I enabled it. At first I thought this refers to automatic generation of chat titles, but they are all called
New Chat
for me. Where can I find the summarization feature and how can I tell whether it works?Beta Was this translation helpful? Give feedback.
All reactions