
Bug in summarizing parallel tool calls. #126

@mostafa-amin-dt

Description

langmem has a bug in how it chooses the messages to summarize when multiple parallel tool calls are involved.

Code to reproduce

import random
import string

from langchain_core.messages import AIMessage, HumanMessage, ToolCall, ToolMessage
from langchain_openai import ChatOpenAI
from langmem.short_term import SummarizationNode


def rand_string(len):
    """Generate a random string of fixed length."""

    alphabet = string.ascii_letters + string.digits
    return "".join(random.choices(alphabet, k=len))

# messages
messages = [
    HumanMessage(content="Generate two long random strings", id="id1"),
    AIMessage(
        content="Utilizing `rand_string` tool for this",
        id="id2",
        tool_calls=[
            ToolCall(name="rand_string", args={"len": 5000}, id="1"),
            ToolCall(name="rand_string", args={"len": 200}, id="2"),
        ],
    ),
    ToolMessage(
        content=rand_string(5000),
        tool_call_id="1",
        id="id3",
    ),  # <--- summary is triggered here
    ToolMessage(content=rand_string(200), tool_call_id="2", id="id4"),
    AIMessage(content="Generated the two random strings.", id="id5"),
    HumanMessage(content="Can you explain the generation algorithm?", id="id6"),
]

# summarize the list with the summarization node

summarization_model = ChatOpenAI(model="gpt-5-mini")  # Adjust your model here
summarization_node = SummarizationNode(
    model=summarization_model,
    max_tokens=384,
    max_summary_tokens=128,
    output_messages_key="llm_input_messages",
)
summary = summarization_node.invoke({"messages": messages})

print(summary["context"])

Environment

I am using Python 3.12.7, and here are my langchain packages:

langchain==0.3.27
langchain-core==0.3.78
langchain-openai==0.3.32
langgraph==0.6.6
langgraph-checkpoint==2.1.1
langgraph-prebuilt==0.6.4
langgraph-sdk==0.2.6
langmem==0.0.29
langsmith==0.4.25
openai==1.106.1

Error Traceback

/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langmem/short_term/summarization.py:246: RuntimeWarning: Failed to trim messages to fit within max_tokens limit before summarization - falling back to the original message list. This may lead to exceeding the context window of the summarization LLM.
  warnings.warn(
Traceback (most recent call last):
  File "/Users/user/langmem_bug/bug.py", line 45, in <module>
    summ = summarization_node.invoke({"messages": messages})
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langgraph/_internal/_runnable.py", line 401, in invoke
    ret = self.func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langmem/short_term/summarization.py", line 832, in _func
    summarization_result = summarize_messages(
                           ^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langmem/short_term/summarization.py", line 478, in summarize_messages
    summary_response = model.invoke(summary_messages)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 395, in invoke
    self.generate_prompt(
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 1025, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 842, in generate
    self._generate_with_cache(
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 1091, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_openai/chat_models/base.py", line 1183, in _generate
    raise e
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_openai/chat_models/base.py", line 1178, in _generate
    raw_response = self.client.with_raw_response.create(**payload)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/_legacy_response.py", line 364, in wrapped
    return cast(LegacyAPIResponse[R], func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: 2", 'type': 'invalid_request_error', 'param': 'messages.[3].role', 'code': None}}

Potential Explanation

The summarization is triggered at the message with id3 (the first tool result), and the summarized message list then ends up like the one below, leaving ToolMessage id4 without its corresponding AIMessage.

messages = [
    SystemMessage(content=...),
    ToolMessage(content=rand_string(200), tool_call_id="2", id="id4"),
    AIMessage(content="Generated the two random strings.", id="id5"),
    HumanMessage(content="Can you explain the generation algorithm?", id="id6"),
]

In this line, the check that assumes the ai_message is the last message is not fully correct: the last message could be one of several parallel tool results, which are not handled properly.
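A minimal sketch of one possible fix, assuming the summarizer works with a cutoff index into the message list: before splitting, advance the cutoff past any ToolMessages whose parent AIMessage (the one that issued the tool_calls) falls on the summarized side, so a tool result is never stranded without its AIMessage. The message shapes here are simplified dicts, and `safe_cutoff` is a hypothetical helper, not langmem's actual API.

```python
def safe_cutoff(messages: list[dict], cutoff: int) -> int:
    """Move `cutoff` forward past any tool results so the split never
    lands between an AI message with tool_calls and its ToolMessages."""
    while cutoff < len(messages) and messages[cutoff].get("role") == "tool":
        cutoff += 1
    return cutoff


# Mirror of the reproduction above: the summary triggers after the first
# tool result (index 2), i.e. a naive cutoff of 3, which would strand the
# ToolMessage answering tool_call_id "2" on the kept side.
msgs = [
    {"role": "human", "content": "Generate two long random strings"},
    {"role": "ai", "tool_calls": ["1", "2"]},
    {"role": "tool", "tool_call_id": "1"},
    {"role": "tool", "tool_call_id": "2"},
    {"role": "ai", "content": "Generated the two random strings."},
]

print(safe_cutoff(msgs, 3))  # -> 4: both tool results stay with their AIMessage
```

With the adjusted cutoff, both tool results end up on the summarized side together with the AIMessage that requested them, so the remaining list starts at a self-contained message and the 400 error above cannot occur.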
