Summarizing for Multi-Turn Conversations #818
base: main
Changes from 1 commit
```diff
@@ -7,6 +7,7 @@
 import asyncio
 import copy
 import re
 from uuid import uuid4
 import skyrl_gym
 from typing import List, Dict, Any, Optional, Union, Tuple
```
```diff
@@ -47,7 +48,6 @@ class StepWiseOutput:
     step_outputs: List[TrajectoryOutput]


 @dataclass
 class AgentLoopState:
     chat_history: ConversationType

@@ -57,6 +57,32 @@ class AgentLoopState:
     response_end_idx: Optional[int]
     done: bool

+    def summarize_chat_history(self, initial_chat_history_length: int) -> ConversationType:
+        """
+        Summarize the chat history.
+        """
+        summary_prompt = """
+Your operational context is full. Generate a concise summary by populating the template below.
+This summary will be your sole context for continuing this task. Be brief but ensure all critical data is present.
+- Mission Objective
+  - Original query: [State the user's verbatim query.]
+  - Verification checklist: [Status (VERIFIED/PENDING)] [Checklist item]
+- Key Findings
+  - Sources: [List the most critical, verified facts with sources.]
+  - Discrepancies: [Note any conflicting information found between sources.]
+- Tactical Plan
+  - Promising leads: [List the best remaining keywords, sources, or angles to investigate.]
+  - Known dead ends: [List queries or sources that proved useless to avoid repetition.]
+  - Immediate next action: [State the exact tool call or query you were about to execute next.]
+Now generate the summary, and put your summary inside tag <summary></summary>.
+"""
+
+        history_to_summarize = self.chat_history[initial_chat_history_length:]
+        summarize_request = self.chat_history[:initial_chat_history_length].copy()
+        summarize_request.extend(history_to_summarize)
```
Suggested change:

```diff
-        history_to_summarize = self.chat_history[initial_chat_history_length:]
-        summarize_request = self.chat_history[:initial_chat_history_length].copy()
-        summarize_request.extend(history_to_summarize)
+        summarize_request = self.chat_history.copy()
```
If the language model does not return the summary within <summary> tags, re.search will return None, and the original, unparsed summary_text will be used. This could lead to a malformed context for the next turn. It's safer to handle this case, for instance, by logging a warning.
Suggested change:

```diff
 if match:
     summary_text = match.group(1).strip()
+else:
+    logger.warning("Could not find <summary> tags in the summarization response. Using the full response as summary.")
```
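The fallback behavior the comment asks for can be exercised in isolation. A minimal sketch, where the `extract_summary` helper name and the module-level `logger` setup are illustrative rather than part of the PR:

```python
import logging
import re

logger = logging.getLogger(__name__)


def extract_summary(response_text: str) -> str:
    """Return the content of the first <summary>...</summary> block.

    Falls back to the full response (after logging a warning) when no
    tags are found, making the failure mode explicit instead of silent.
    """
    match = re.search(r"<summary>(.*?)</summary>", response_text, re.DOTALL)
    if match:
        return match.group(1).strip()
    logger.warning(
        "Could not find <summary> tags in the summarization response. "
        "Using the full response as summary."
    )
    return response_text.strip()
```

With this shape, a malformed model response still produces a usable (if noisier) context for the next turn, and the warning leaves a trace for debugging.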
The summarization threshold is hardcoded as 80% of max_input_length. This magic number makes the code harder to maintain. It's better to make it a configurable parameter with a default value.
Suggested change:

```diff
-threshold = int(max_input_length * 0.8)
+threshold = int(max_input_length * self.generator_cfg.get("summarization_threshold_ratio", 0.8))
```
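A minimal sketch of the configurable-threshold pattern, assuming `generator_cfg` behaves like a plain dict with `.get` (the PR's actual config object may differ):

```python
def summarization_threshold(max_input_length: int, generator_cfg: dict) -> int:
    """Compute the token-count threshold that triggers summarization.

    The ratio is read from config with a default of 0.8, so the former
    magic number becomes a tunable parameter.
    """
    ratio = generator_cfg.get("summarization_threshold_ratio", 0.8)
    return int(max_input_length * ratio)
```

Keeping the default in one place means existing configs keep working, while experiments can trigger summarization earlier or later without a code change.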
This large summary prompt is hardcoded within the `summarize_chat_history` method. For better readability and maintainability, consider moving it to a module-level constant (e.g., `_SUMMARY_PROMPT`) at the top of the file.