Description
Problem Statement
When enable_auto_context_summarization is enabled in LLMAssistantAggregatorParams, the current implementation excludes only the first message (assumed to be the system message) from the summarization process.
However, in real-world applications there are often other messages during the conversation that should not be summarized. Examples include:
- Tool responses
- Dynamically injected system prompts
- Important contextual data added later in the conversation
Currently, these messages may be summarized along with the rest of the conversation, which can lead to loss of critical structured information needed for future turns.
This limitation becomes more complex when working with providers like Google or Anthropic, where the message role system or tool abstraction differs from OpenAI. In these cases, determining which messages should or should not be summarized may require provider-specific logic.
Proposed Solution
Introduce a flag that allows explicit control over whether a message should be included in context summarization.
1. Message-Level Control
Extend messages in LLMContext to include a flag:
```python
LLMContext(messages=messages)
```

Each message could include:

```python
{
    "role": "system",
    "content": "...",
    "include_in_context_summarization": True
}
```

Default behavior:
include_in_context_summarization = True
This would allow developers to exclude specific messages from summarization by setting:
include_in_context_summarization = False
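The intended filtering behavior could be sketched as follows. This is illustrative only: the helper name partition_for_summarization and the sample messages are hypothetical, and only the flag name comes from this proposal.

```python
# Hypothetical helper: split a context's messages into those that may be
# summarized and those that must be preserved verbatim. A missing flag
# defaults to True, matching the proposed default behavior.
def partition_for_summarization(messages):
    to_summarize = []
    to_preserve = []
    for message in messages:
        if message.get("include_in_context_summarization", True):
            to_summarize.append(message)
        else:
            to_preserve.append(message)
    return to_summarize, to_preserve


messages = [
    {"role": "system", "content": "Base prompt",
     "include_in_context_summarization": False},
    {"role": "user", "content": "Hi there"},
    {"role": "assistant", "content": "Hello!"},
]

summarizable, preserved = partition_for_summarization(messages)
# summarizable holds the user/assistant turns; preserved keeps the system prompt
```

The summarizer would then operate only on the first list and splice the preserved messages back into the rebuilt context unchanged.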
2. Tool Result Control
Allow the same behavior for tool responses using FunctionCallResultProperties:
```python
async def fun(params: FunctionCallParams):
    await params.result_callback(
        {"message": "Message"},
        properties=FunctionCallResultProperties(
            include_in_context_summarization=False
        ),
    )
```

This ensures certain tool outputs remain intact in the conversation context and are not collapsed into summaries.
3. Message-Level Control in Frames
For frames such as:
```python
LLMMessagesAppendFrame(
    messages=[{"role": "system", "content": template}]
)
```

The include_in_context_summarization flag should be supported at the message level so each message can explicitly control whether it participates in summarization.
Example:
```python
LLMMessagesAppendFrame(
    messages=[
        {
            "role": "system",
            "content": template,
            "include_in_context_summarization": False,
        }
    ]
)
```

This keeps the control granular and explicit, allowing developers to selectively exclude specific injected messages (such as system prompts or structured data) from the summarization process while leaving other messages unaffected.
Alternative Solutions
Additional Context
A real production scenario where this becomes critical:
In a Property Management System (PMS) voice assistant for hotels:
- A guest calls the hotel.
- At the beginning of the conversation, the system does not yet know which reservation belongs to the caller.
- The assistant asks the guest for confirmation details (name, booking ID, etc.).
- After confirmation, the system retrieves the reservation and injects the guest data into the context.
Example injected data might include:
- Guest profile
- Reservation details
- Room information
- Stay dates
This information should behave like persistent system context, not normal conversation.
However, with the current summarization behavior:
- These injected messages may be included in summarization
- The structured reservation data can be lost or degraded
- The assistant may fail to answer reservation-specific questions later in the conversation
Being able to mark these messages with:
include_in_context_summarization = False
would ensure critical runtime context remains intact even after multiple summarization cycles.
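For the PMS scenario above, the injected reservation message could look like the sketch below. All field values are made up for illustration; only the flag name is from this proposal.

```python
import json

# Illustrative guest-data message injected after the caller is identified.
# Marking it as excluded from summarization keeps the structured data intact
# across summarization cycles.
reservation_context = {
    "role": "system",
    "content": json.dumps({
        "guest_name": "Jane Doe",
        "booking_id": "ABC123",
        "room": "204",
        "stay": {"check_in": "2024-06-01", "check_out": "2024-06-05"},
    }),
    "include_in_context_summarization": False,
}

# In a pipeline, this dict would then be appended via
# LLMMessagesAppendFrame(messages=[reservation_context]).
```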
Would you be willing to help implement this feature?
- Yes, I'd like to contribute
- No, I'm just suggesting