WIP: Bedrock AgentCore memory exploration #559
Summary
This PR is an exploration of the Bedrock AgentCore Memory API and showcases two approaches to storing, listing, and retrieving memories with the agentcore API. It adds three new memory tools, `store_messages`, `list_messages`, and `search_memories`, that wrap the memory client provided by `bedrock-agentcore` and expose a simple interface for storing conversation messages. The tools handle the conversion from LangChain messages to agentcore memory events, and vice versa. See the included sample notebooks for a reference implementation; I am using the tools with the prebuilt ReAct agents that LangGraph provides, but the concept can be extended to work with any LangGraph graph.
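To make the shape of these wrappers concrete, here is a minimal sketch of what a `store_messages`-style tool could look like. The `MemoryClient` constructor and `create_event` signature, the role mapping, and all identifiers are assumptions for illustration; the notebooks contain the actual implementation.

```python
from typing import Annotated

from bedrock_agentcore.memory import MemoryClient
from langchain_core.tools import tool
from langgraph.prebuilt import InjectedState

client = MemoryClient(region_name="us-west-2")  # region is an assumption

# Hypothetical identifiers for illustration only.
MEMORY_ID, ACTOR_ID, SESSION_ID = "my-memory-id", "user-1", "session-1"


@tool
def store_messages(
    messages: Annotated[list, InjectedState("messages")],
) -> str:
    """Store the current conversation messages in agentcore memory."""
    # Convert LangChain messages to the (text, role) payloads the
    # Memory API expects; this role mapping is an assumption.
    payload = [
        (m.content, "USER" if m.type == "human" else "ASSISTANT")
        for m in messages
        if m.type in ("human", "ai") and m.content
    ]
    client.create_event(
        memory_id=MEMORY_ID,
        actor_id=ACTOR_ID,
        session_id=SESSION_ID,
        messages=payload,
    )
    return "Messages stored."
```

Here are the key findings: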
Approach 1: Using memory as tools
This approach resulted in a fairly non-deterministic workflow: I noticed inconsistent results across executions, with the LLM skipping the tool calls that store conversations or retrieve events and memories. In the attached notebook, the agent saved only the initial conversation, then skipped calling the tool in subsequent turns and never retrieved memories. I have not spent much time tweaking the tool/agent instructions or testing with a different model, either of which could improve results, so I am curious to learn whether the community has had better experience with this approach.
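For reference, the wiring for this approach amounts to handing the memory tools to a prebuilt ReAct agent and instructing it to use them. The model identifier and prompt below are assumptions, not the notebook code; the tools are the wrappers described in the summary (one of which is sketched above).

```python
from langchain_aws import ChatBedrockConverse
from langgraph.prebuilt import create_react_agent

# Assumed model id; any tool-calling Bedrock model works here.
model = ChatBedrockConverse(model="anthropic.claude-3-5-haiku-20241022-v1:0")

agent = create_react_agent(
    model,
    tools=[store_messages, list_messages, search_memories],
    prompt=(
        "Call search_memories for relevant context before answering, "
        "and call store_messages to save each conversation turn."
    ),
)

agent.invoke({"messages": [("user", "Hi, I'm planning a trip to Tokyo.")]})
```

Whether the model actually makes those calls on every turn is entirely up to the model, which is the source of the non-determinism described above.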
Approach 2: Using pre/post-model-hooks
This approach produced better output, and it makes it much easier to control which messages are stored in and retrieved from the Memory API. While I am using the prebuilt ReAct agents, and therefore pre/post-model hooks, you could replicate the process in a custom graph, where additional nodes (or a prompt runnable) handle this.
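A minimal sketch of the hook wiring, under the same assumptions as the earlier snippets: the hook bodies, the `retrieve_memories` signature, and the shape of its results are illustrative, not the notebook code. Because the hooks run deterministically on every turn, storage and retrieval no longer depend on the model's choices.

```python
from bedrock_agentcore.memory import MemoryClient
from langgraph.prebuilt import create_react_agent

client = MemoryClient(region_name="us-west-2")
MEMORY_ID, ACTOR_ID, SESSION_ID = "my-memory-id", "user-1", "session-1"


def pre_model_hook(state):
    # Retrieve long-term memories relevant to the latest user message
    # and prepend them to the model input without mutating graph state.
    query = state["messages"][-1].content
    memories = client.retrieve_memories(
        memory_id=MEMORY_ID,
        namespace=f"/users/{ACTOR_ID}",  # assumed namespace layout
        query=query,
    )
    # The result shape here is an assumption.
    context = "\n".join(m["content"]["text"] for m in memories)
    return {
        "llm_input_messages": [
            {"role": "system", "content": f"Relevant memories:\n{context}"},
            *state["messages"],
        ]
    }


def post_model_hook(state):
    # Persist the latest turn as a memory event. Assumes a simple
    # user/assistant exchange with no intervening tool calls.
    user_msg, ai_msg = state["messages"][-2], state["messages"][-1]
    client.create_event(
        memory_id=MEMORY_ID,
        actor_id=ACTOR_ID,
        session_id=SESSION_ID,
        messages=[(user_msg.content, "USER"), (ai_msg.content, "ASSISTANT")],
    )
    return {}


agent = create_react_agent(
    model,  # model as defined in the previous snippet
    tools=[],
    pre_model_hook=pre_model_hook,
    post_model_hook=post_model_hook,
)
```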
FAQs
Why not implement the Checkpointer interface for short term memory?
The Memory API is geared towards storing conversation turns as text and is not suitable for storing the full graph state that a checkpointer requires. To clarify: the event payload type only allows a text body and a role, which is inherent to the way extraction of long-term memories (summarization, semantic, etc.) currently works within the API. The Memory API is also missing filtering options for the `list_events` API, which is a key requirement for the checkpointer `list` interface. While we could still iterate through all events and filter inside the checkpointer implementation, this would severely affect graph performance, since the list API is called several times within a single graph invocation.
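To make the payload mismatch concrete, compare the two shapes; both literals below are illustrative, not taken from either API's type definitions.

```python
# What a LangGraph checkpoint must round-trip: arbitrary structured
# channel values, not just text.
checkpoint_state = {
    "messages": ["<full message objects, tool calls included>"],
    "my_custom_channel": {"any": "serializable value"},
}

# What an agentcore memory event payload can carry: text plus a role.
event_payload = [("What's the weather in Seattle?", "USER")]
```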
Why not implement the BaseStore interface for long term memory?
While BaseStore provides the abstraction for storing long-term memories, I found that it would offer very low utility when backed by the Memory API; users will find it much easier to use the tools in this PR, or the memory client directly. One thing to note is that the storage of long-term memories is not automatic in LangGraph: the application has to handle it explicitly, or use the utilities (SummarizationNode, Memory Manager, and Memory Tools) that LangGraph/LangMem provide to work in conjunction with the store. There are two disconnects with the Memory API here: 1) it requires conversation turns to be saved as events, so we cannot use `put` for storing conversation turns and would have to skip the `put` implementation in BaseStore; 2) the actual extraction happens in the service back-end, while the utilities listed above have their own LLM layer that manages extraction and uses the store to save. This means you cannot use any of the existing utilities with agentcore memory.

Related to aws/bedrock-agentcore-sdk-python#26