Description
The RunningSummary.summary field is typed as str, but it is assigned from summary_response.content, whose type is Union[str, list[Union[str, dict]]] (BaseMessage.content), so it can actually hold a list. The code in _prepare_summarization_result assumes it is always a string when performing in operations, causing a TypeError.
Error Details
Error Message:
TypeError: 'in <string>' requires string as left operand, not list
Location:
langmem/short_term/summarization.py, line 306 in _prepare_summarization_result
Failing Code:
and existing_summary.summary
in preprocessed_messages.existing_system_message.content
Root Cause
- RunningSummary.summary is set from summary_response.content (line 644 in summarization.py)
- BaseMessage.content is typed as Union[str, list[Union[str, dict]]] (from langchain_core.messages.base)
- Some language models (e.g., Google Gemini) return content as lists instead of strings
- The comparison operation at line 306 assumes existing_summary.summary is a string
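The failure mode can be reproduced in isolation: when the right operand of Python's in operator is a string, the left operand must also be a string, so a list-valued summary raises immediately. A minimal sketch (no langmem required; the variable names are illustrative stand-ins for the fields at line 306):

```python
# A Gemini-style list-valued summary, as stored in RunningSummary.summary
list_summary = [{"type": "text", "text": "Conversation summary so far."}]
# A string system prompt, as in existing_system_message.content
system_content = "You are a helpful assistant. Summary: ..."

try:
    # Mirrors the comparison at line 306 of summarization.py
    found = list_summary in system_content
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```
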
Reproduction
from langchain_core.messages import SystemMessage, HumanMessage
from langmem.short_term.summarization import asummarize_messages, RunningSummary
from langchain.chat_models import init_chat_model
# Use a model that returns list-based content (e.g., Gemini)
llm = init_chat_model("gemini-2.5-flash", location="global", include_thoughts=True)
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hello"),
    # ... more messages to trigger summarization
]
# Initial summarization works
result = await asummarize_messages(
    messages,
    model=llm,
    max_tokens=10000,
    max_tokens_before_summary=8000,
)
# On subsequent call with the running_summary, if the model returned
# list-based content, this will fail
result2 = await asummarize_messages(
    new_messages,
    running_summary=result.running_summary,  # summary field is a list
    model=llm,
    max_tokens=10000,
    max_tokens_before_summary=8000,
)
# TypeError: 'in <string>' requires string as left operand, not list
Stack Trace
File "langmem/short_term/summarization.py", line 651, in asummarize_messages
    return _prepare_summarization_result(
        preprocessed_messages=preprocessed_messages,
        existing_summary=running_summary,
        summary_response=summary_response,
        final_prompt=final_prompt,
    )
File "langmem/short_term/summarization.py", line 306, in _prepare_summarization_result
and existing_summary.summary
^^^^^^^^^^^^^^^^^^^^^^^^
in preprocessed_messages.existing_system_message.content
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'in <string>' requires string as left operand, not list
Proposed Solution
Option 1: Normalize content in RunningSummary
Add a helper function to normalize content to string when creating RunningSummary:
def _normalize_content_to_string(content: str | list[str | dict]) -> str:
    """Normalize message content to string format."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        text_parts = []
        for item in content:
            if isinstance(item, str):
                text_parts.append(item)
            elif isinstance(item, dict) and item.get("type") == "text" and "text" in item:
                text_parts.append(str(item["text"]))
        return "".join(text_parts)
    return str(content)
# Update lines 643-649 in summarization.py
running_summary = RunningSummary(
    summary=_normalize_content_to_string(summary_response.content),  # <-- normalize here
    summarized_message_ids=summarized_message_ids,
    last_summarized_message_id=preprocessed_messages.messages_to_summarize[-1].id,
)
Option 2: Fix the comparison at line 306
Handle both string and list types in the comparison:
# Before (lines 305-307)
and existing_summary.summary
in preprocessed_messages.existing_system_message.content

# After
and _normalize_content_to_string(existing_summary.summary)
in _normalize_content_to_string(preprocessed_messages.existing_system_message.content)
Option 3: Strictly type RunningSummary.summary as str
Update the RunningSummary dataclass to explicitly type summary as str and enforce normalization at creation time.
Impact
This bug affects any usage of asummarize_messages with:
- Language models that return structured content (Gemini, Claude with tool use, etc.)
- Multiple summarization rounds (when running_summary is passed from previous calls)
Environment
- langmem: 0.0.30
- langchain: 0.3.27
- langchain-google-vertexai: 2.1.2
- Python version: 3.13
- Affected models: Google Gemini, potentially others that return list-based content
Workaround
Users can normalize the running_summary before passing it back to asummarize_messages:
def normalize_running_summary(running_summary: RunningSummary | None) -> RunningSummary | None:
    if running_summary is None:
        return None
    # Normalize to string
    if isinstance(running_summary.summary, str):
        normalized_summary = running_summary.summary
    elif isinstance(running_summary.summary, list):
        text_parts = []
        for item in running_summary.summary:
            if isinstance(item, str):
                text_parts.append(item)
            elif isinstance(item, dict) and item.get("type") == "text":
                text_parts.append(str(item["text"]))
        normalized_summary = "".join(text_parts)
    else:
        normalized_summary = str(running_summary.summary)
    return RunningSummary(
        summary=normalized_summary,
        summarized_message_ids=running_summary.summarized_message_ids,
        last_summarized_message_id=running_summary.last_summarized_message_id,
    )
# Use the normalized version
normalized_summary = normalize_running_summary(result.running_summary)
result2 = await asummarize_messages(
    new_messages,
    running_summary=normalized_summary,
    ...
)
Additional Notes
A consistent normalization strategy throughout the langmem library would prevent similar issues in other operations.