
Bug in summarizing parallel tool calls. #126

@mostafa-amin-dt

Description

langmem has a bug in how it chooses the messages to summarize when multiple parallel tool calls are involved.

Code to reproduce

import random
import string

from langchain_core.messages import AIMessage, HumanMessage, ToolCall, ToolMessage
from langchain_openai import ChatOpenAI
from langmem.short_term import SummarizationNode


def rand_string(len):
    """Generate a random string of fixed length."""

    alphabet = string.ascii_letters + string.digits
    return "".join(random.choices(alphabet, k=len))

# messages
messages = [
    HumanMessage(content="Generate two long random strings", id="id1"),
    AIMessage(
        content="Utilizing `rand_string` tool for this",
        id="id2",
        tool_calls=[
            ToolCall(name="rand_string", args={"len": 5000}, id="1"),
            ToolCall(name="rand_string", args={"len": 200}, id="2"),
        ],
    ),
    ToolMessage(
        content=rand_string(5000),
        tool_call_id="1",
        id="id3",
    ),  # <--- summary is triggered here
    ToolMessage(content=rand_string(200), tool_call_id="2", id="id4"),
    AIMessage(content="Generated the two random strings.", id="id5"),
    HumanMessage(content="Can you explain the generation algorithm?", id="id6"),
]

# summarize the list with the summarization node

summarization_model = ChatOpenAI(model="gpt-5-mini")  # Adjust your model here
summarization_node = SummarizationNode(
    model=summarization_model,
    max_tokens=384,
    max_summary_tokens=128,
    output_messages_key="llm_input_messages",
)
summary = summarization_node.invoke({"messages": messages})

print(summary["context"])

Environment

I am using Python 3.12.7, and here are my langchain packages:

langchain==0.3.27
langchain-core==0.3.78
langchain-openai==0.3.32
langgraph==0.6.6
langgraph-checkpoint==2.1.1
langgraph-prebuilt==0.6.4
langgraph-sdk==0.2.6
langmem==0.0.29
langsmith==0.4.25
openai==1.106.1

Error Traceback

/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langmem/short_term/summarization.py:246: RuntimeWarning: Failed to trim messages to fit within max_tokens limit before summarization - falling back to the original message list. This may lead to exceeding the context window of the summarization LLM.
  warnings.warn(
Traceback (most recent call last):
  File "/Users/user/langmem_bug/bug.py", line 45, in <module>
    summ = summarization_node.invoke({"messages": messages})
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langgraph/_internal/_runnable.py", line 401, in invoke
    ret = self.func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langmem/short_term/summarization.py", line 832, in _func
    summarization_result = summarize_messages(
                           ^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langmem/short_term/summarization.py", line 478, in summarize_messages
    summary_response = model.invoke(summary_messages)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 395, in invoke
    self.generate_prompt(
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 1025, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 842, in generate
    self._generate_with_cache(
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 1091, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_openai/chat_models/base.py", line 1183, in _generate
    raise e
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/langchain_openai/chat_models/base.py", line 1178, in _generate
    raw_response = self.client.with_raw_response.create(**payload)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/_legacy_response.py", line 364, in wrapped
    return cast(LegacyAPIResponse[R], func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/.pyenv/versions/3.12.7/lib/python3.12/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: 2", 'type': 'invalid_request_error', 'param': 'messages.[3].role', 'code': None}}

Potential Explanation

The summarization is triggered at the message with id3 (the first tool result), and the summarized message list then ends up like the one below, leaving ToolMessage id4 without its corresponding AIMessage.

messages = [
    SystemMessage(content=...),
    ToolMessage(content=rand_string(200), tool_call_id="2", id="id4"),
    AIMessage(content="Generated the two random strings.", id="id5"),
    HumanMessage(content="Can you explain the generation algorithm?", id="id6"),
]

In this line, the check that assumes the ai_message is the last message is not fully correct: the last message could be one of several parallel tool results, which are not handled properly.
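A minimal sketch of one possible fix, assuming the summarizer works with a cutoff index into the message list: before splitting, advance the cutoff past any ToolMessages whose parent AIMessage (the one that issued the tool_calls) falls on the summarized side, so a tool result is never stranded without its AIMessage. The message shapes here are simplified dicts, and `safe_cutoff` is a hypothetical helper, not langmem's actual API.

```python
def safe_cutoff(messages: list[dict], cutoff: int) -> int:
    """Move `cutoff` forward past any tool results so the split never
    lands between an AI message with tool_calls and its ToolMessages."""
    while cutoff < len(messages) and messages[cutoff].get("role") == "tool":
        cutoff += 1
    return cutoff


# Mirror of the reproduction above: the summary triggers after the first
# tool result (index 2), i.e. a naive cutoff of 3, which would strand the
# ToolMessage answering tool_call_id "2" on the kept side.
msgs = [
    {"role": "human", "content": "Generate two long random strings"},
    {"role": "ai", "tool_calls": ["1", "2"]},
    {"role": "tool", "tool_call_id": "1"},
    {"role": "tool", "tool_call_id": "2"},
    {"role": "ai", "content": "Generated the two random strings."},
]

print(safe_cutoff(msgs, 3))  # -> 4: both tool results stay with their AIMessage
```

With the adjusted cutoff, both tool results end up on the summarized side together with the AIMessage that requested them, so the remaining list starts at a self-contained message and the 400 error above cannot occur.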
