Merge pull request anthropics#154 from anthropics/alexander/memory-cookbook-suggestions

etd-23 · web-flow · commit 66ee346c51bd · 2025-05-29T23:07:52.000-07:00
Memory cookbook suggestions
diff --git a/tool_use/memory_cookbook.ipynb b/tool_use/memory_cookbook.ipynb
@@ -29,12 +29,18 @@
    "source": [
     "### Introduction\n",
     "\n",
-    "Managing memory effectively is a critical part of building agents and agentic workflows that handle long-horizon tasks. In this cookbook we're going to demonstrate a few different strategies for \"self-managed\" (llm-managed) memory. Use this notebook as a starting point for your own memory implementations. We do not expect that memory tools are one-size-fits-all, and further believe that different domains/tasks necessarily lend themselves to more or less rigid memory scaffolding. The Claude 4 model family has proven to be particularly strong at utilizing memory tooling, and we're excited to see how teams extend the ideas below.\n",
+    "Managing memory effectively is a critical part of building agents and agentic workflows that handle long-horizon tasks. In this cookbook we demonstrate a few different strategies for \"self-managed\" (LLM-managed) memory. Use this notebook as a starting point for your own memory implementations. We do not expect that memory tools are one-size-fits-all, and further believe that different domains/tasks necessarily lend themselves to more or less rigid memory scaffolding. The Claude 4 model family has proven to be particularly strong at utilizing [memory tooling](https://www.anthropic.com/news/claude-4#:~:text=more%20on%20methodology.-,Model%20improvements,-In%20addition%20to), and we're excited to see how teams extend the ideas below.\n",
     "\n",
     "\n",
     "#### Why do we need to manage memory?\n",
     "\n",
-    "LLMs have finite context windows (200k tokens for Claude-4 Sonnet & Opus). Tactically this means that any request > 200k tokens will be truncated. As many teams building with LLMs quickly learn, there is additional complexity in identifying and working within the *effective* context window of an LLM. Often, in practice, most tasks see performance degregation at thresholds significantly less that the maximum available context window. Successfully building LLM-based systems is an exercise in discarding the unnecessary tokens and efficiently storing + retrieving the relevant tokens for the task at hand."
+    "LLMs have finite context windows (200k tokens for Claude 4 Sonnet & Opus). This means that for any request, if the sum of prompt tokens and output tokens exceeds the model’s context window, the system will return a validation error. As many teams building with LLMs quickly learn, there is additional complexity in identifying and working within the *effective* [context window](https://docs.anthropic.com/en/docs/build-with-claude/context-windows) of an LLM. See our tips for [long context prompting](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips) to learn more about effective context windows and best practices.\n",
+    "\n",
+    "In addition to the above, memory is important for the following reasons:\n",
+    "- **Long context windows are computationally expensive:** Attention cost scales quadratically with context length, so doubling the context roughly quadruples the attention compute. Most tasks only need a small fraction of the available context, making it wasteful to process large numbers of irrelevant tokens. This is why humans don't memorize entire textbooks; we take notes and build mental models instead.\n",
+    "- **More efficient processing:** When LLMs write and maintain their own notes (saving successful strategies, key insights, and relevant context), they effectively update their capabilities in real time without retraining. Models that excel at these operations can maintain coherent behavior over extremely long time horizons while using only a fraction of the computational resources required for full context windows.\n",
+    "\n",
+    "Successfully building LLM-based systems is an exercise in discarding the unnecessary tokens and efficiently storing + retrieving the relevant tokens for the task at hand."
    ]
   },
   {
@@ -46,25 +52,17 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Note: you may need to restart the kernel to use updated packages.\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "# install deps\n",
     "%pip install -q -U anthropic python-dotenv nest_asyncio PyPDF2"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 36,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -92,27 +90,37 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 37,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "fatal: destination path '/tmp/anthropic-quickstarts' already exists and is not an empty directory.\n"
+      "Repository already exists at /tmp/anthropic-quickstarts\n"
      ]
     }
    ],
    "source": [
     "import sys \n",
+    "import os\n",
     "\n",
-    "# clone the agents quickstart implementation\n",
-    "!git clone https://github.com/anthropics/anthropic-quickstarts.git /tmp/anthropic-quickstarts\n",
-    "\n",
-    "# navigate to the agents quickstart implementation\n",
-    "!cd /tmp/anthropic-quickstarts\n",
-    "\n",
-    "sys.path.append(os.path.abspath('.'))"
+    "# Check if the repo already exists\n",
+    "if not os.path.exists('/tmp/anthropic-quickstarts'):\n",
+    "    # Clone the agents quickstart implementation\n",
+    "    !git clone https://github.com/anthropics/anthropic-quickstarts.git /tmp/anthropic-quickstarts\n",
+    "else:\n",
+    "    print(\"Repository already exists at /tmp/anthropic-quickstarts\")\n",
+    "\n",
+    "# IMPORTANT: Insert at the beginning of sys.path to override any existing 'agents' modules\n",
+    "if '/tmp/anthropic-quickstarts' not in sys.path:\n",
+    "    sys.path.insert(0, '/tmp/anthropic-quickstarts')\n",
+    "\n",
+    "# Clear any cached imports of 'agents' module\n",
+    "if 'agents' in sys.modules:\n",
+    "    del sys.modules['agents']\n",
+    "if 'agents.agent' in sys.modules:\n",
+    "    del sys.modules['agents.agent']"
    ]
   },
   {
@@ -124,14 +132,14 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 38,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Oh joy, another laptop problem. What's it doing? Blue-screening? Making strange noises? Becoming self-aware? I need details before I can wave my magical tech support wand.\n"
+      "*eye roll* Another laptop crisis. What's it doing? Singing off-key? Refusing to work unless you feed it cookies? Details, please.\n"
      ]
     }
    ],
@@ -156,15 +164,16 @@
    "source": [
     "### Implementation 1: Simple Memory Tool\n",
     "\n",
-    "*Implementation borrowed from [Barry Zhang](https://github.com/ItsBarryZ)*. See the agents quick-start tools [here](https://github.com/anthropics/anthropic-quickstarts/tree/main/agents/tools) as well as the Anthropic API tools [docs](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview).\n",
+    "*This implementation mirrors the tools in our agents quickstarts repo [here](https://github.com/anthropics/anthropic-quickstarts/tree/main/agents/tools). For more information on tool use, see the Anthropic API tools [docs](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview).*\n",
     "\n",
     "The `SimpleMemory()` tool gives the model a scratchpad to manage memory. This is maintained as a single string that can be read or updated.\n",
     "\n",
     "Here we've defined the `read`, `write`, and `edit` actions. Explicitly defining `read` means the model won't have access to the full contents of memory at every turn. We recommend that if you follow this pattern you introduce a separate, shortened summary or metadata object describing the contents of memory and include that in every request (ideally preventing excessive reads).\n",
     "\n",
     "\n",
     "<b>When would you use this?</b>\n",
-    "- You want to quickly spin up a memory experiment or augment an existing long-context task. Start here if you don't have high conviction around the types of items that need to be stored or if the agent must support many interaction types.\n",
+    "\n",
+    "Use this when you want to quickly spin up a memory experiment or augment an existing long-context task. Start here if you don't have high conviction around the types of items that need to be stored or if the agent must support many interaction types.\n",
     "\n",
     "<b><i>General Notes on Tool Use:</i></b> \n",
     "- Your tool descriptions should be clear and sufficiently detailed. The best way to guide model behavior around tools is by providing direction as to when / under what conditions tools should be used. \n",
@@ -173,7 +182,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 39,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -296,7 +305,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 40,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -371,7 +380,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 41,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -551,22 +560,29 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 57,
+   "execution_count": 43,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "{'type': 'file',\n",
-       " 'id': 'file_011CPN5QewZbKuHeB8gL1Fwr',\n",
-       " 'size_bytes': 32378962,\n",
-       " 'created_at': '2025-05-22T06:14:19.943000Z',\n",
-       " 'filename': 'SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf',\n",
-       " 'mime_type': 'application/pdf',\n",
-       " 'downloadable': False}"
+       "[{'type': 'file',\n",
+       "  'id': 'file_011CPaGpXxdBojQLTszA5LGp',\n",
+       "  'size_bytes': 544347,\n",
+       "  'created_at': '2025-05-28T16:51:06.716000Z',\n",
+       "  'filename': 'sample.pdf',\n",
+       "  'mime_type': 'application/pdf',\n",
+       "  'downloadable': False},\n",
+       " {'type': 'file',\n",
+       "  'id': 'file_011CPYNG2Sf1cWjuCFhKJFV7',\n",
+       "  'size_bytes': 3,\n",
+       "  'created_at': '2025-05-27T16:41:15.335000Z',\n",
+       "  'filename': 'number.txt',\n",
+       "  'mime_type': 'text/plain',\n",
+       "  'downloadable': True}]"
       ]
      },
-     "execution_count": 57,
+     "execution_count": 43,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -645,10 +661,11 @@
     "            raise ValueError(f\"Failed to upload file: {res.status_code} - {res.text}\")\n",
     "        \n",
     "# example usage\n",
-    "file_path = \"/Users/user/Downloads/SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf\" # REPLACE\n",
+    "# file_path = \"/Users/user/Downloads/SB1029-ProjectUpdate-FINAL_020317-A11Y.pdf\" # REPLACE\n",
     "storage_manager = StorageManager(os.getenv(\"ANTHROPIC_API_KEY\"))\n",
-    "uploaded = storage_manager.upload_file(file_path)\n",
-    "storage_manager.get_file_metadata(uploaded['id'])"
+    "# uploaded = storage_manager.upload_file(file_path)\n",
+    "# storage_manager.get_file_metadata(uploaded['id'])\n",
+    "storage_manager.list_files()[:2]"
    ]
   },
   {
@@ -697,7 +714,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 55,
+   "execution_count": 44,
    "metadata": {},
    "outputs": [
     {
@@ -714,7 +731,7 @@
        "    └── projects"
       ]
      },
-     "execution_count": 55,
+     "execution_count": 44,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -743,7 +760,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 45,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -1034,31 +1051,7 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "/var/folders/40/m42jqbt54j90clf75tsn03kw0000gp/T/ipykernel_92531/3353802839.py:99: DeprecationWarning: on_submit is deprecated. Instead, set the .continuous_update attribute to False and observe the value changing with: mywidget.observe(callback, 'value').\n",
-      "  self.text_input.on_submit(self.on_send)\n"
-     ]
-    },
-    {
-     "data": {
-      "application/vnd.jupyter.widget-view+json": {
-       "model_id": "92bc4784ef0c462d9b737c14c040f508",
-       "version_major": 2,
-       "version_minor": 0
-      },
-      "text/plain": [
-       "HBox(children=(VBox(children=(Label(value='Chat'), Output(layout=Layout(border_bottom='1px solid #ccc', border…"
-      ]
-     },
-     "execution_count": 77,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
    "source": [
     "memory_tool = FileBasedMemoryTool() # or SimpleMemory() or CompactifyMemory(client) or FileBasedMemoryTool(storage_manager)\n",
     "model_config = {\n",