|
29 | 29 | "source": [ |
30 | 30 | "### Introduction\n", |
31 | 31 | "\n", |
32 | | - "Managing memory effectively is a critical part of building agents and agentic workflows that handle long-horizon tasks. In this cookbook we're going to demonstrate a few different strategies for \"self-managed\" (llm-managed) memory. Use this notebook as a starting point for your own memory implementations. We do not expect that memory tools are one-size-fits-all, and further believe that different domains/tasks necessarily lend themselves to more or less rigid memory scaffolding. The Claude 4 model family has proven to be particularly strong at utilizing memory tooling, and we're excited to see how teams extend the ideas below.\n", |
| 32 | + "Managing memory effectively is a critical part of building agents and agentic workflows that handle long-horizon tasks. In this cookbook we demonstrate a few different strategies for \"self-managed\" (LLM-managed) memory. Use this notebook as a starting point for your own memory implementations. We do not expect that memory tools are one-size-fits-all, and further believe that different domains/tasks necessarily lend themselves to more or less rigid memory scaffolding. The Claude 4 model family has proven to be particularly strong at utilizing [memory tooling](https://www.anthropic.com/news/claude-4#:~:text=more%20on%20methodology.-,Model%20improvements,-In%20addition%20to), and we're excited to see how teams extend the ideas below.\n", |
33 | 33 | "\n", |
34 | 34 | "\n", |
35 | 35 | "#### Why do we need to manage memory?\n", |
36 | 36 | "\n", |
37 | | - "LLMs have finite context windows (200k tokens for Claude-4 Sonnet & Opus). Tactically this means that any request > 200k tokens will be truncated. As many teams building with LLMs quickly learn, there is additional complexity in identifying and working within the *effective* context window of an LLM. Often, in practice, most tasks see performance degregation at thresholds significantly less that the maximum available context window. Successfully building LLM-based systems is an exercise in discarding the unnecessary tokens and efficiently storing + retrieving the relevant tokens for the task at hand." |
| 37 | + "LLMs have finite context windows (200k tokens for Claude 4 Sonnet & Opus). This means that for any request, if the sum of prompt tokens and output tokens exceeds the model’s context window, the system will return a validation error. As many teams building with LLMs quickly learn, there is additional complexity in identifying and working within the *effective* [context window](https://docs.anthropic.com/en/docs/build-with-claude/context-windows) of an LLM. See our tips for [long context prompting](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips) to learn more about effective context windows and best practices.\n", |
| 38 | + "\n", |
| 39 | + "In addition to the above, memory is important for the following reasons:\n", |
| 40 | + "- **Long context windows are computationally expensive:** Standard self-attention scales quadratically with sequence length, so doubling the context length roughly quadruples the attention compute. Most tasks only need a small fraction of the available context, making it wasteful to process hundreds of thousands of irrelevant tokens. This is why humans don't memorize entire textbooks; we take notes and build mental models instead.\n", |
| 41 | + "- **More efficient processing:** When LLMs write and maintain their own notes (saving successful strategies, key insights, and relevant context), they effectively update their capabilities in real time without retraining. Models that excel at these operations can maintain coherent behavior over extremely long time horizons while using only a fraction of the computational resources required for full context windows.\n", |
| 42 | + "\n", |
| 43 | + "Successfully building LLM-based systems is an exercise in discarding the unnecessary tokens and efficiently storing + retrieving the relevant tokens for the task at hand." |
38 | 44 | ] |
39 | 45 | }, |
40 | 46 | { |
|
46 | 52 | }, |
47 | 53 | { |
48 | 54 | "cell_type": "code", |
49 | | - "execution_count": 1, |
| 55 | + "execution_count": null, |
50 | 56 | "metadata": {}, |
51 | | - "outputs": [ |
52 | | - { |
53 | | - "name": "stdout", |
54 | | - "output_type": "stream", |
55 | | - "text": [ |
56 | | - "Note: you may need to restart the kernel to use updated packages.\n" |
57 | | - ] |
58 | | - } |
59 | | - ], |
| 57 | + "outputs": [], |
60 | 58 | "source": [ |
61 | 59 | "# install deps\n", |
62 | 60 | "%pip install -q -U anthropic python-dotenv nest_asyncio PyPDF2" |
63 | 61 | ] |
64 | 62 | }, |
65 | 63 | { |
66 | 64 | "cell_type": "code", |
67 | | - "execution_count": 7, |
| 65 | + "execution_count": 36, |
68 | 66 | "metadata": {}, |
69 | 67 | "outputs": [], |
70 | 68 | "source": [ |
|
92 | 90 | }, |
93 | 91 | { |
94 | 92 | "cell_type": "code", |
95 | | - "execution_count": 5, |
| 93 | + "execution_count": 37, |
96 | 94 | "metadata": {}, |
97 | 95 | "outputs": [ |
98 | 96 | { |
99 | 97 | "name": "stdout", |
100 | 98 | "output_type": "stream", |
101 | 99 | "text": [ |
102 | | - "fatal: destination path '/tmp/anthropic-quickstarts' already exists and is not an empty directory.\n" |
| 100 | + "Repository already exists at /tmp/anthropic-quickstarts\n" |
103 | 101 | ] |
104 | 102 | } |
105 | 103 | ], |
106 | 104 | "source": [ |
107 | 105 | "import sys \n", |
| 106 | + "import os\n", |
108 | 107 | "\n", |
109 | | - "# clone the agents quickstart implementation\n", |
110 | | - "!git clone https://github.com/anthropics/anthropic-quickstarts.git /tmp/anthropic-quickstarts\n", |
111 | | - "\n", |
112 | | - "# navigate to the agents quickstart implementation\n", |
113 | | - "!cd /tmp/anthropic-quickstarts\n", |
114 | | - "\n", |
115 | | - "sys.path.append(os.path.abspath('.'))" |
| 108 | + "# Check if the repo already exists\n", |
| 109 | + "if not os.path.exists('/tmp/anthropic-quickstarts'):\n", |
| 110 | + " # Clone the agents quickstart implementation\n", |
| 111 | + " !git clone https://github.com/anthropics/anthropic-quickstarts.git /tmp/anthropic-quickstarts\n", |
| 112 | + "else:\n", |
| 113 | + " print(\"Repository already exists at /tmp/anthropic-quickstarts\")\n", |
| 114 | + "\n", |
| 115 | + "# IMPORTANT: Insert at the beginning of sys.path to override any existing 'agents' modules\n", |
| 116 | + "if '/tmp/anthropic-quickstarts' not in sys.path:\n", |
| 117 | + " sys.path.insert(0, '/tmp/anthropic-quickstarts')\n", |
| 118 | + "\n", |
| 119 | + "# Clear any cached imports of 'agents' module\n", |
| 120 | + "if 'agents' in sys.modules:\n", |
| 121 | + " del sys.modules['agents']\n", |
| 122 | + "if 'agents.agent' in sys.modules:\n", |
| 123 | + " del sys.modules['agents.agent']" |
116 | 124 | ] |
117 | 125 | }, |
118 | 126 | { |
|
124 | 132 | }, |
125 | 133 | { |
126 | 134 | "cell_type": "code", |
127 | | - "execution_count": 3, |
| 135 | + "execution_count": 38, |
128 | 136 | "metadata": {}, |
129 | 137 | "outputs": [ |
130 | 138 | { |
131 | 139 | "name": "stdout", |
132 | 140 | "output_type": "stream", |
133 | 141 | "text": [ |
134 | | - "Oh joy, another laptop problem. What's it doing? Blue-screening? Making strange noises? Becoming self-aware? I need details before I can wave my magical tech support wand.\n" |
| 142 | + "*eye roll* Another laptop crisis. What's it doing? Singing off-key? Refusing to work unless you feed it cookies? Details, please.\n" |
135 | 143 | ] |
136 | 144 | } |
137 | 145 | ], |
|
156 | 164 | "source": [ |
157 | 165 | "### Implementation 1: Simple Memory Tool\n", |
158 | 166 | "\n", |
159 | | - "*Implementation borrowed from [Barry Zhang](https://github.com/ItsBarryZ)*. See the agents quick-start tools [here](https://github.com/anthropics/anthropic-quickstarts/tree/main/agents/tools) as well as the Anthropic API tools [docs](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview).\n", |
| 167 | + "*This implementation mirrors the tools in our agents quickstart repo [here](https://github.com/anthropics/anthropic-quickstarts/tree/main/agents/tools). For more information on tool use, see the Anthropic API tools [docs](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview).*\n", |
160 | 168 | "\n", |
161 | 169 | "The `SimpleMemory()` tool gives the model a scratchpad to manage memory. This is maintained as a single string that can be read or updated.\n", |
162 | 170 | "\n", |
163 | 171 | "Here we've defined the `read`, `write`, and `edit` actions. Explicitly defining `read` means the model won't have access to the full contents of memory at every turn. We recommend that if you follow this pattern you introduce a separate, shortened summary or metadata object describing the contents of memory and include that in every request (ideally preventing excessive reads).\n", |
164 | 172 | "\n", |
165 | 173 | "\n", |
166 | 174 | "<b>When would you use this?</b>\n", |
167 | | - "- You want to quickly spin up a memory experiment or augment an existing long-context task. Start here if you don't have high conviction around the types of items that need to be stored or if the agent must support many interaction types.\n", |
| 175 | + "\n", |
| 176 | + "You want to quickly spin up a memory experiment or augment an existing long-context task. Start here if you don't have high conviction around the types of items that need to be stored or if the agent must support many interaction types.\n", |
168 | 177 | "\n", |
169 | 178 | "<b><i>General Notes on Tool Use:</i></b> \n", |
170 | 179 | "- Your tool descriptions should be clear and sufficiently detailed. The best way to guide model behavior around tools is by providing direction as to when / under what conditions tools should be used. \n", |
|
173 | 182 | }, |
174 | 183 | { |
175 | 184 | "cell_type": "code", |
176 | | - "execution_count": 4, |
| 185 | + "execution_count": 39, |
177 | 186 | "metadata": {}, |
178 | 187 | "outputs": [], |
179 | 188 | "source": [ |
|
296 | 305 | }, |
297 | 306 | { |
298 | 307 | "cell_type": "code", |
299 | | - "execution_count": null, |
| 308 | + "execution_count": 40, |
300 | 309 | "metadata": {}, |
301 | 310 | "outputs": [], |
302 | 311 | "source": [ |
|
371 | 380 | }, |
372 | 381 | { |
373 | 382 | "cell_type": "code", |
374 | | - "execution_count": null, |
| 383 | + "execution_count": 41, |
375 | 384 | "metadata": {}, |
376 | 385 | "outputs": [], |
377 | 386 | "source": [ |
|
551 | 560 | }, |
552 | 561 | { |
553 | 562 | "cell_type": "code", |
554 | | - "execution_count": 57, |
| 563 | + "execution_count": 43, |
555 | 564 | "metadata": {}, |
556 | 565 | "outputs": [ |
557 | 566 | { |
558 | 567 | "data": { |
559 | 568 | "text/plain": [ |
560 | | - "{'type': 'file',\n", |
561 | | - " 'id': 'file_011CPN5QewZbKuHeB8gL1Fwr',\n", |
562 | | - " 'size_bytes': 32378962,\n", |
563 | | - " 'created_at': '2025-05-22T06:14:19.943000Z',\n", |
564 | | - " 'filename': 'SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf',\n", |
565 | | - " 'mime_type': 'application/pdf',\n", |
566 | | - " 'downloadable': False}" |
| 569 | + "[{'type': 'file',\n", |
| 570 | + " 'id': 'file_011CPaGpXxdBojQLTszA5LGp',\n", |
| 571 | + " 'size_bytes': 544347,\n", |
| 572 | + " 'created_at': '2025-05-28T16:51:06.716000Z',\n", |
| 573 | + " 'filename': 'sample.pdf',\n", |
| 574 | + " 'mime_type': 'application/pdf',\n", |
| 575 | + " 'downloadable': False},\n", |
| 576 | + " {'type': 'file',\n", |
| 577 | + " 'id': 'file_011CPYNG2Sf1cWjuCFhKJFV7',\n", |
| 578 | + " 'size_bytes': 3,\n", |
| 579 | + " 'created_at': '2025-05-27T16:41:15.335000Z',\n", |
| 580 | + " 'filename': 'number.txt',\n", |
| 581 | + " 'mime_type': 'text/plain',\n", |
| 582 | + " 'downloadable': True}]" |
567 | 583 | ] |
568 | 584 | }, |
569 | | - "execution_count": 57, |
| 585 | + "execution_count": 43, |
570 | 586 | "metadata": {}, |
571 | 587 | "output_type": "execute_result" |
572 | 588 | } |
|
645 | 661 | " raise ValueError(f\"Failed to upload file: {res.status_code} - {res.text}\")\n", |
646 | 662 | " \n", |
647 | 663 | "# example usage\n", |
648 | | - "file_path = \"/Users/user/Downloads/SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf\" # REPLACE\n", |
| 664 | + "#file_path = \"/Users/user/Downloads/SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf\" # REPLACE\n", |
649 | 665 | "storage_manager = StorageManager(os.getenv(\"ANTHROPIC_API_KEY\"))\n", |
650 | | - "uploaded = storage_manager.upload_file(file_path)\n", |
651 | | - "storage_manager.get_file_metadata(uploaded['id'])" |
| 666 | + "#uploaded = storage_manager.upload_file(file_path)\n", |
| 667 | + "#storage_manager.get_file_metadata(uploaded['id'])\n", |
| 668 | + "storage_manager.list_files()[:2]" |
652 | 669 | ] |
653 | 670 | }, |
654 | 671 | { |
|
697 | 714 | }, |
698 | 715 | { |
699 | 716 | "cell_type": "code", |
700 | | - "execution_count": 55, |
| 717 | + "execution_count": 44, |
701 | 718 | "metadata": {}, |
702 | 719 | "outputs": [ |
703 | 720 | { |
|
714 | 731 | " └── projects" |
715 | 732 | ] |
716 | 733 | }, |
717 | | - "execution_count": 55, |
| 734 | + "execution_count": 44, |
718 | 735 | "metadata": {}, |
719 | 736 | "output_type": "execute_result" |
720 | 737 | } |
|
743 | 760 | }, |
744 | 761 | { |
745 | 762 | "cell_type": "code", |
746 | | - "execution_count": null, |
| 763 | + "execution_count": 45, |
747 | 764 | "metadata": {}, |
748 | 765 | "outputs": [], |
749 | 766 | "source": [ |
|
1034 | 1051 | "cell_type": "code", |
1035 | 1052 | "execution_count": null, |
1036 | 1053 | "metadata": {}, |
1037 | | - "outputs": [ |
1038 | | - { |
1039 | | - "name": "stderr", |
1040 | | - "output_type": "stream", |
1041 | | - "text": [ |
1042 | | - "/var/folders/40/m42jqbt54j90clf75tsn03kw0000gp/T/ipykernel_92531/3353802839.py:99: DeprecationWarning: on_submit is deprecated. Instead, set the .continuous_update attribute to False and observe the value changing with: mywidget.observe(callback, 'value').\n", |
1043 | | - " self.text_input.on_submit(self.on_send)\n" |
1044 | | - ] |
1045 | | - }, |
1046 | | - { |
1047 | | - "data": { |
1048 | | - "application/vnd.jupyter.widget-view+json": { |
1049 | | - "model_id": "92bc4784ef0c462d9b737c14c040f508", |
1050 | | - "version_major": 2, |
1051 | | - "version_minor": 0 |
1052 | | - }, |
1053 | | - "text/plain": [ |
1054 | | - "HBox(children=(VBox(children=(Label(value='Chat'), Output(layout=Layout(border_bottom='1px solid #ccc', border…" |
1055 | | - ] |
1056 | | - }, |
1057 | | - "execution_count": 77, |
1058 | | - "metadata": {}, |
1059 | | - "output_type": "execute_result" |
1060 | | - } |
1061 | | - ], |
| 1054 | + "outputs": [], |
1062 | 1055 | "source": [ |
1063 | 1056 | "memory_tool = FileBasedMemoryTool() # or SimpleMemory() or CompactifyMemory(client) or FileBasedMemoryTool(storage_manager)\n", |
1064 | 1057 | "model_config = {\n", |
|