Commit 66ee346

Merge pull request anthropics#154 from anthropics/alexander/memory-cookbook-suggestions
Memory cookbook suggestions
2 parents fb261fd + b98b8e7 commit 66ee346

1 file changed

+62
-69
lines changed

tool_use/memory_cookbook.ipynb

Lines changed: 62 additions & 69 deletions
@@ -29,12 +29,18 @@
 "source": [
 "### Introduction\n",
 "\n",
-"Managing memory effectively is a critical part of building agents and agentic workflows that handle long-horizon tasks. In this cookbook we're going to demonstrate a few different strategies for \"self-managed\" (llm-managed) memory. Use this notebook as a starting point for your own memory implementations. We do not expect that memory tools are one-size-fits-all, and further believe that different domains/tasks necessarily lend themselves to more or less rigid memory scaffolding. The Claude 4 model family has proven to be particularly strong at utilizing memory tooling, and we're excited to see how teams extend the ideas below.\n",
+"Managing memory effectively is a critical part of building agents and agentic workflows that handle long-horizon tasks. In this cookbook we demonstrate a few different strategies for \"self-managed\" (LLM-managed) memory. Use this notebook as a starting point for your own memory implementations. We do not expect that memory tools are one-size-fits-all, and further believe that different domains/tasks necessarily lend themselves to more or less rigid memory scaffolding. The Claude 4 model family has proven to be particularly strong at utilizing [memory tooling](https://www.anthropic.com/news/claude-4#:~:text=more%20on%20methodology.-,Model%20improvements,-In%20addition%20to), and we're excited to see how teams extend the ideas below.\n",
 "\n",
 "\n",
 "#### Why do we need to manage memory?\n",
 "\n",
-"LLMs have finite context windows (200k tokens for Claude-4 Sonnet & Opus). Tactically this means that any request > 200k tokens will be truncated. As many teams building with LLMs quickly learn, there is additional complexity in identifying and working within the *effective* context window of an LLM. Often, in practice, most tasks see performance degregation at thresholds significantly less that the maximum available context window. Successfully building LLM-based systems is an exercise in discarding the unnecessary tokens and efficiently storing + retrieving the relevant tokens for the task at hand."
+"LLMs have finite context windows (200k tokens for Claude 4 Sonnet & Opus). This means that for any request, if the sum of prompt tokens and output tokens exceeds the model’s context window, the system will return a validation error. As many teams building with LLMs quickly learn, there is additional complexity in identifying and working within the *effective* [context window](https://docs.anthropic.com/en/docs/build-with-claude/context-windows) of an LLM. See our tips for [long context prompting](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips) to learn more about effective context windows and best practices.\n",
+"\n",
+"In addition to the above, memory is important for the following reasons:\n",
+"- **Long context windows are computationally expensive:** Attention mechanisms scale quadratically—doubling context length quadruples compute cost. Most tasks only need a small fraction of available context, making it wasteful to process millions of irrelevant tokens. This is why humans don't memorize entire textbooks; we take notes and build mental models instead.\n",
+"- **More efficient processing:** When LLMs write and maintain their own notes—saving successful strategies, key insights, and relevant context—they're effectively updating their capabilities in real-time without retraining. Models that excel at these operations can maintain coherent behavior over extremely long time horizons while using only a fraction of the computational resources required for full context windows.\n",
+"\n",
+"Successfully building LLM-based systems is an exercise in discarding the unnecessary tokens and efficiently storing + retrieving the relevant tokens for the task at-hand."
 ]
 },
 {
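
The two numeric claims in the added Introduction text (a request fails validation when prompt plus output tokens exceed the window, and doubling context length roughly quadruples attention cost) can be checked with a few lines of arithmetic. The following is a minimal sketch and not code from the notebook: `fits_in_context` and `relative_attention_cost` are hypothetical helper names, the 200,000-token limit is the figure quoted in the cell, and the purely quadratic cost model is the same simplification the text uses.

```python
# Hypothetical helpers for illustration; neither name comes from the notebook.

CONTEXT_WINDOW = 200_000  # tokens, the figure quoted in the notebook text


def fits_in_context(prompt_tokens: int, max_output_tokens: int,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """Return True if prompt plus requested output fits in the window."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= context_window


def relative_attention_cost(new_len: int, old_len: int) -> float:
    """Cost ratio under a purely quadratic model: (new_len / old_len) ** 2."""
    if new_len <= 0 or old_len <= 0:
        raise ValueError("lengths must be positive")
    return (new_len / old_len) ** 2


if __name__ == "__main__":
    print(fits_in_context(150_000, 50_000))           # exactly at the limit
    print(fits_in_context(190_000, 20_000))           # over the limit
    print(relative_attention_cost(200_000, 100_000))  # doubling the length
```

A check like `fits_in_context` would run before a request is sent, which is one place a memory tool earns its keep: if the budget is exceeded, older turns are summarized or moved to memory rather than sent verbatim.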
@@ -46,25 +52,17 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "metadata": {},
-"outputs": [
-{
-"name": "stdout",
-"output_type": "stream",
-"text": [
-"Note: you may need to restart the kernel to use updated packages.\n"
-]
-}
-],
+"outputs": [],
 "source": [
 "# install deps\n",
 "%pip install -q -U anthropic python-dotenv nest_asyncio PyPDF2"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 7,
+"execution_count": 36,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -92,27 +90,37 @@
 },
 {
 "cell_type": "code",
-"execution_count": 5,
+"execution_count": 37,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"fatal: destination path '/tmp/anthropic-quickstarts' already exists and is not an empty directory.\n"
+"Repository already exists at /tmp/anthropic-quickstarts\n"
 ]
 }
 ],
 "source": [
 "import sys \n",
+"import os\n",
 "\n",
-"# clone the agents quickstart implementation\n",
-"!git clone https://github.com/anthropics/anthropic-quickstarts.git /tmp/anthropic-quickstarts\n",
-"\n",
-"# navigate to the agents quickstart implementation\n",
-"!cd /tmp/anthropic-quickstarts\n",
-"\n",
-"sys.path.append(os.path.abspath('.'))"
+"# Check if the repo already exists\n",
+"if not os.path.exists('/tmp/anthropic-quickstarts'):\n",
+" # Clone the agents quickstart implementation\n",
+" !git clone https://github.com/anthropics/anthropic-quickstarts.git /tmp/anthropic-quickstarts\n",
+"else:\n",
+" print(\"Repository already exists at /tmp/anthropic-quickstarts\")\n",
+"\n",
+"# IMPORTANT: Insert at the beginning of sys.path to override any existing 'agents' modules\n",
+"if '/tmp/anthropic-quickstarts' not in sys.path:\n",
+" sys.path.insert(0, '/tmp/anthropic-quickstarts')\n",
+"\n",
+"# Clear any cached imports of 'agents' module\n",
+"if 'agents' in sys.modules:\n",
+" del sys.modules['agents']\n",
+"if 'agents.agent' in sys.modules:\n",
+" del sys.modules['agents.agent']"
 ]
 },
 {
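
The added cell clears two specific entries (`agents` and `agents.agent`) from `sys.modules` and inserts the checkout path only when it is absent from `sys.path`. A slightly more general variant is sketched below under stated assumptions: `purge_cached_package` and `prepend_to_sys_path` are hypothetical names, not part of the notebook or the quickstarts repo, and the sketch assumes the goal is to drop every cached submodule of a package and to guarantee the checkout is searched first even if the path was already listed later in `sys.path`.

```python
import sys


def purge_cached_package(package: str) -> list[str]:
    """Remove `package` and all of its submodules from sys.modules.

    Returns the names that were removed so the caller can log them.
    """
    prefix = package + "."
    # Build the list first; deleting while iterating over the dict would fail.
    stale = [name for name in sys.modules
             if name == package or name.startswith(prefix)]
    for name in stale:
        del sys.modules[name]
    return stale


def prepend_to_sys_path(path: str) -> None:
    """Put `path` at the front of sys.path, moving it there if already listed."""
    if path in sys.path:
        sys.path.remove(path)
    sys.path.insert(0, path)
```

In the notebook's setting this would be called as `prepend_to_sys_path('/tmp/anthropic-quickstarts')` followed by `purge_cached_package('agents')` before the first `import agents`. Objects already imported from the old module keep referencing the old code, so a kernel restart remains the more reliable reset.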
@@ -124,14 +132,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": 38,
 "metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"Oh joy, another laptop problem. What's it doing? Blue-screening? Making strange noises? Becoming self-aware? I need details before I can wave my magical tech support wand.\n"
+"*eye roll* Another laptop crisis. What's it doing? Singing off-key? Refusing to work unless you feed it cookies? Details, please.\n"
 ]
 }
 ],
@@ -156,15 +164,16 @@
 "source": [
 "### Implementation 1: Simple Memory Tool\n",
 "\n",
-"*Implementation borrowed from [Barry Zhang](https://github.com/ItsBarryZ)*. See the agents quick-start tools [here](https://github.com/anthropics/anthropic-quickstarts/tree/main/agents/tools) as well as the Anthropic API tools [docs](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview).\n",
+"*This implementation is a reflection of our agents quickstarts repo [here](https://github.com/anthropics/anthropic-quickstarts/tree/main/agents/tools). For more information on tool use, see the Anthropic API tools [docs](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview).*\n",
 "\n",
 "The `SimpleMemory()` tool gives the model a scratchpad to manage memory. This is maintained as a single string that can be read or updated.\n",
 "\n",
 "Here we've defined the `read`, `write`, and `edit` actions. Explicitly defining `read` means the model won't have access to the full contents of memory at every turn. We recommend that if you follow this pattern you introduce a separate, shortened summary or metadata object describing the contents of memory and include that in every request (ideally preventing excessive reads).\n",
 "\n",
 "\n",
 "<b>When would you use this?</b>\n",
-"- You want to quickly spin up a memory experiment or augment an existing long-context task. Start here if you don't have high conviction around the types of items that need to be stored or if the agent must support many interaction types.\n",
+"\n",
+"You want to quickly spin up a memory experiment or augment an existing long-context task. Start here if you don't have high conviction around the types of items that need to be stored or if the agent must support many interaction types.\n",
 "\n",
 "<b><i>General Notes on Tool Use:</i></b> \n",
 "- Your tool descriptions should be clear and sufficiently detailed. The best way to guide model behavior around tools is by providing direction as to when / under what conditions tools should be used. \n",
@@ -173,7 +182,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": 39,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -296,7 +305,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 40,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -371,7 +380,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 41,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -551,22 +560,29 @@
 },
 {
 "cell_type": "code",
-"execution_count": 57,
+"execution_count": 43,
 "metadata": {},
 "outputs": [
 {
 "data": {
 "text/plain": [
-"{'type': 'file',\n",
-" 'id': 'file_011CPN5QewZbKuHeB8gL1Fwr',\n",
-" 'size_bytes': 32378962,\n",
-" 'created_at': '2025-05-22T06:14:19.943000Z',\n",
-" 'filename': 'SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf',\n",
-" 'mime_type': 'application/pdf',\n",
-" 'downloadable': False}"
+"[{'type': 'file',\n",
+" 'id': 'file_011CPaGpXxdBojQLTszA5LGp',\n",
+" 'size_bytes': 544347,\n",
+" 'created_at': '2025-05-28T16:51:06.716000Z',\n",
+" 'filename': 'sample.pdf',\n",
+" 'mime_type': 'application/pdf',\n",
+" 'downloadable': False},\n",
+" {'type': 'file',\n",
+" 'id': 'file_011CPYNG2Sf1cWjuCFhKJFV7',\n",
+" 'size_bytes': 3,\n",
+" 'created_at': '2025-05-27T16:41:15.335000Z',\n",
+" 'filename': 'number.txt',\n",
+" 'mime_type': 'text/plain',\n",
+" 'downloadable': True}]"
 ]
 },
-"execution_count": 57,
+"execution_count": 43,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -645,10 +661,11 @@
 " raise ValueError(f\"Failed to upload file: {res.status_code} - {res.text}\")\n",
 " \n",
 "# example usage\n",
-"file_path = \"/Users/user/Downloads/SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf\" # REPLACE\n",
+"#file_path = \"/Users/user/Downloads/SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf\" # REPLACE\n",
 "storage_manager = StorageManager(os.getenv(\"ANTHROPIC_API_KEY\"))\n",
-"uploaded = storage_manager.upload_file(file_path)\n",
-"storage_manager.get_file_metadata(uploaded['id'])"
+"#uploaded = storage_manager.upload_file(file_path)\n",
+"#storage_manager.get_file_metadata(uploaded['id'])\n",
+"storage_manager.list_files()[:2]"
 ]
 },
 {
{
@@ -697,7 +714,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 55,
+"execution_count": 44,
 "metadata": {},
 "outputs": [
 {
@@ -714,7 +731,7 @@
 " └── projects"
 ]
 },
-"execution_count": 55,
+"execution_count": 44,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -743,7 +760,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": null,
+"execution_count": 45,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -1034,31 +1051,7 @@
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
-"outputs": [
-{
-"name": "stderr",
-"output_type": "stream",
-"text": [
-"/var/folders/40/m42jqbt54j90clf75tsn03kw0000gp/T/ipykernel_92531/3353802839.py:99: DeprecationWarning: on_submit is deprecated. Instead, set the .continuous_update attribute to False and observe the value changing with: mywidget.observe(callback, 'value').\n",
-" self.text_input.on_submit(self.on_send)\n"
-]
-},
-{
-"data": {
-"application/vnd.jupyter.widget-view+json": {
-"model_id": "92bc4784ef0c462d9b737c14c040f508",
-"version_major": 2,
-"version_minor": 0
-},
-"text/plain": [
-"HBox(children=(VBox(children=(Label(value='Chat'), Output(layout=Layout(border_bottom='1px solid #ccc', border…"
-]
-},
-"execution_count": 77,
-"metadata": {},
-"output_type": "execute_result"
-}
-],
+"outputs": [],
 "source": [
 "memory_tool = FileBasedMemoryTool() # or SimpleMemory() or CompactifyMemory(client) or FileBasedMemoryTool(storage_manager)\n",
 "model_config = {\n",