Summarizing for Multi-Turn Conversations #818
base: main
Conversation
Code Review
This pull request introduces a mechanism to summarize multi-turn conversation histories when the context length exceeds a certain threshold. The implementation adds a new configuration flag summarize_chat, a method to construct and perform the summarization, and integrates this into the agent loop. My review focuses on improving the maintainability, robustness, and configurability of the new summarization logic. I've suggested moving a large hardcoded prompt to a constant, simplifying some redundant code, handling a potential failure case in summary extraction, and making the summarization threshold configurable. These changes should make the feature more robust and easier to manage.
```python
if match:
    summary_text = match.group(1).strip()
```
If the language model does not return the summary within <summary> tags, re.search will return None, and the original, unparsed summary_text will be used. This could lead to a malformed context for the next turn. It's safer to handle this case, for instance, by logging a warning.
Suggested change:
```python
if match:
    summary_text = match.group(1).strip()
else:
    logger.warning("Could not find <summary> tags in the summarization response. Using the full response as summary.")
```
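The suggested fallback can be pulled into a small standalone helper, which also makes it easy to unit-test. The function name `extract_summary` is an assumption; the regex and warning text follow the suggestion above.

```python
import logging
import re

logger = logging.getLogger(__name__)

def extract_summary(response: str) -> str:
    """Return the text inside <summary>...</summary>, or fall back to the
    whole response with a warning when the tags are missing."""
    match = re.search(r"<summary>(.*?)</summary>", response, re.DOTALL)
    if match:
        return match.group(1).strip()
    logger.warning(
        "Could not find <summary> tags in the summarization response. "
        "Using the full response as summary."
    )
    return response.strip()
```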
```python
summary_prompt = """
Your operational context is full. Generate a concise summary by populating the template below.
This summary will be your sole context for continuing this task. Be brief but ensure all critical data is present.

- Mission Objective
  - Original query: [State the user's verbatim query.]
  - Verification checklist: [Status (VERIFIED/PENDING)] [Checklist item]
- Key Findings
  - Sources: [List the most critical, verified facts with sources.]
  - Discrepancies: [Note any conflicting information found between sources.]
- Tactical Plan
  - Promising leads: [List the best remaining keywords, sources, or angles to investigate.]
  - Known dead ends: [List queries or sources that proved useless to avoid repetition.]
  - Immediate next action: [State the exact tool call or query you were about to execute next.]

Now generate the summary, and put your summary inside tag <summary></summary>.
"""
```
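The review summary above recommends moving this large hardcoded prompt to a constant. A minimal sketch of that refactor, with the constant name `SUMMARIZE_PROMPT` and the helper `append_summarize_prompt` assumed for illustration (the full template text from the PR is elided here):

```python
# Module-level constant so the prompt is defined once, not rebuilt
# inside the agent loop on every summarization.
SUMMARIZE_PROMPT = """\
Your operational context is full. Generate a concise summary by populating the template below.
(... full template as in the PR ...)
Now generate the summary, and put your summary inside tag <summary></summary>.
"""

def append_summarize_prompt(chat_history, prompt=SUMMARIZE_PROMPT):
    """Build the summarization request by appending the prompt as a user turn."""
    return chat_history + [{"role": "user", "content": prompt}]
```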
```python
history_to_summarize = self.chat_history[initial_chat_history_length:]
summarize_request = self.chat_history[:initial_chat_history_length].copy()
summarize_request.extend(history_to_summarize)
```
The logic to create summarize_request can be simplified. These three lines are equivalent to self.chat_history.copy().
Suggested change:
```python
summarize_request = self.chat_history.copy()
```
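The claimed equivalence is easy to verify: slicing a list at any index and re-extending with the remainder reproduces the original list. A quick check, with the PR's three-line construction wrapped in a hypothetical function:

```python
def build_summarize_request(chat_history, initial_chat_history_length):
    # The PR's original three-line construction, verbatim in structure.
    history_to_summarize = chat_history[initial_chat_history_length:]
    summarize_request = chat_history[:initial_chat_history_length].copy()
    summarize_request.extend(history_to_summarize)
    return summarize_request

history = [
    {"role": "system", "content": "s"},
    {"role": "user", "content": "u"},
    {"role": "assistant", "content": "a"},
]
# For every split point, the result equals a plain shallow copy.
for n in range(len(history) + 1):
    assert build_summarize_request(history, n) == history.copy()
```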
```python
    done=False,
)

threshold = int(max_input_length * 0.8)
```
The summarization threshold is hardcoded as 80% of max_input_length. This magic number makes the code harder to maintain. It's better to make it a configurable parameter with a default value.
Suggested change:
```python
threshold = int(max_input_length * self.generator_cfg.get("summarization_threshold_ratio", 0.8))
```
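The suggested config lookup can be isolated in a small helper, assuming `generator_cfg` is a plain dict (the helper name `summarization_threshold` is hypothetical):

```python
def summarization_threshold(max_input_length: int, generator_cfg: dict) -> int:
    """Derive the summarization trigger point from config, defaulting to
    80% of the model's maximum input length."""
    ratio = generator_cfg.get("summarization_threshold_ratio", 0.8)
    return int(max_input_length * ratio)
```

With this in place, operators can tune how aggressively summarization kicks in without touching the agent-loop code.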
No description provided.