|
13 | 13 | "id": "eeab798a",
|
14 | 14 | "metadata": {},
|
15 | 15 | "source": [
|
16 |
| - "AI agents often operate in **long-running, multi-turn interactions**, where keeping the right balance of context is critical. If too much is carried forward, the model risks distraction, inefficiency, or outright failure. If too little is preserved, the agent loses coherence. This guide focuses on two proven context management techniques—**trimming** and **compression**—to keep agents fast, reliable, and cost-efficient.\n", |
| 16 | + "AI agents often operate in **long-running, multi-turn interactions**, where keeping the right balance of **context** is critical. If too much is carried forward, the model risks distraction, inefficiency, or outright failure. If too little is preserved, the agent loses coherence. \n", |
17 | 17 | "\n",
|
18 |
| - "In this cookbook, we’ll explore how to **manage context effectively using the `Session` object from the [OpenAI Agents SDK](https://github.com/openai/openai-agents-python)**.\n", |
| 18 | + "Here, context refers to the total window of tokens (input + output) that the model can attend to at once. For [GPT-5](https://platform.openai.com/docs/models/gpt-5), this capacity is up to 272k input tokens and 128k output tokens, but even such a large window can be overwhelmed by uncurated histories, redundant tool results, or noisy retrievals. This makes context management not just an optimization, but a necessity.\n", |
| 19 | + "\n", |
| 20 | + "In this cookbook, we’ll explore how to **manage context effectively using the `Session` object from the [OpenAI Agents SDK](https://github.com/openai/openai-agents-python)**, focusing on two proven context management techniques—**trimming** and **compression**—to keep agents fast, reliable, and cost-efficient.\n", |
19 | 21 | "\n",
|
20 | 22 | "#### Why Context Management Matters\n",
|
21 | 23 | "\n",
|
|
35 | 37 | ""
|
36 | 38 | ]
|
37 | 39 | },
|
| 40 | + { |
| 41 | + "cell_type": "markdown", |
| 42 | + "id": "4ae8fdc3", |
| 43 | + "metadata": {}, |
| 44 | + "source": [ |
| 45 | + "The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses/create#responses-create-previous_response_id) includes **basic memory support** through built-in state and message chaining with `previous_response_id`.\n", |
| 46 | + "\n", |
| 47 | + "You can continue a conversation by passing the prior response’s `id` as `previous_response_id`, or you can manage context manually by collecting outputs into a list and resubmitting them as the `input` for the next response.\n", |
| 48 | + "\n", |
| 49 | + "What you don’t get is **automatic memory management**. That’s where the **Agents SDK** comes in. It provides [session memory](https://openai.github.io/openai-agents-python/sessions/) on top of Responses, so you no longer need to manually append `response.output` or track IDs yourself. The session becomes the **memory object**: you simply pass the same session into each `Runner.run(...)` call, and the SDK handles history and continuity for you, making it far easier to build coherent, multi-turn agents." |
| 50 | + ] |
| 51 | + }, |
38 | 52 | {
|
39 | 53 | "cell_type": "markdown",
|
40 | 54 | "id": "7068564c",
|
|
50 | 64 | "\n",
|
51 | 65 | "#### Techniques Covered\n",
|
52 | 66 | "\n",
|
53 |
| - "To address these challenges, we introduce two concrete approaches using OpenAI Agents SDK:\n", |
| 67 | + "To address these challenges, we introduce two concrete approaches using the OpenAI Agents SDK:\n", |
| 68 | + "\n", |
| 69 | + "- **Context Trimming** – dropping older turns while keeping the last N turns.\n", |
| 70 | + " - **Pros**\n", |
| 71 | + "\n", |
| 72 | + " * **Deterministic & simple:** No summarizer variability; easy to reason about state and to reproduce runs.\n", |
| 73 | + " * **Zero added latency:** No extra model calls to compress history.\n", |
| 74 | + " * **Fidelity for recent work:** Latest tool results, parameters, and edge cases stay verbatim—great for debugging.\n", |
| 75 | + " * **Lower risk of “summary drift”:** You never reinterpret or compress facts.\n", |
| 76 | + "\n", |
| 77 | + "   - **Cons**\n", |
54 | 78 | "\n",
|
55 |
| - "1. **Trimming Messages** – dropping older turns while keeping the last N turns.\n", |
56 |
| - "2. **Summarizing Messages** – compressing prior exchanges into structured, shorter representations.\n", |
| 79 | + " * **Forgets long-range context abruptly:** Important earlier constraints, IDs, or decisions can vanish once they scroll past N.\n", |
| 80 | + " * **User experience “amnesia”:** Agent can appear to “forget” promises or prior preferences midway through long sessions.\n", |
| 81 | + " * **Wasted signal:** Older turns may contain reusable knowledge (requirements, constraints) that gets dropped.\n", |
| 82 | + " * **Token spikes still possible:** If a recent turn includes huge tool payloads, your last-N can still blow up the context.\n", |
57 | 83 | "\n",
|
| 84 | + " - **Best when**\n", |
58 | 85 | "\n",
|
| 86 | + "      - Your tasks in the conversation are independent of each other, with non-overlapping context that does not require carrying earlier details forward.\n", |
| 87 | + " - You need predictability, easy evals, and low latency (ops automations, CRM/API actions).\n", |
| 88 | + " - The conversation’s useful context is local (recent steps matter far more than distant history).\n", |
| 89 | + "\n", |
| 90 | + "- **Context Summarization** – compressing prior messages (assistant, user, tools, etc.) into structured, shorter summaries injected into the conversation history.\n", |
| 91 | + "\n", |
| 92 | + " - **Pros**\n", |
| 93 | + "\n", |
| 94 | + " * **Retains long-range memory compactly:** Past requirements, decisions, and rationales persist beyond N.\n", |
| 95 | + " * **Smoother UX:** Agent “remembers” commitments and constraints across long sessions.\n", |
| 96 | + " * **Cost-controlled scale:** One concise summary can replace hundreds of turns.\n", |
| 97 | + " * **Searchable anchor:** A single synthetic assistant message becomes a stable “state of the world so far.”\n", |
| 98 | + "\n", |
| 99 | + "   - **Cons**\n", |
| 100 | + "\n", |
| 101 | + " * **Summarization loss & bias:** Details can be dropped or misweighted; subtle constraints may vanish.\n", |
| 102 | + " * **Latency & cost spikes:** Each refresh adds model work (and potentially tool-trim logic).\n", |
| 103 | + " * **Compounding errors:** If a bad fact enters the summary, it can **poison** future behavior (“context poisoning”).\n", |
| 104 | + " * **Observability complexity:** You must log summary prompts/outputs for auditability and evals.\n", |
| 105 | + "\n", |
| 106 | + " - **Best when**\n", |
| 107 | + "\n", |
| 108 | + "      - Your tasks need context collected across the flow, such as planning/coaching, RAG-heavy analysis, or policy Q&A.\n", |
| 109 | + "      - You need continuity over long horizons and must carry important details forward to solve related tasks.\n", |
| 110 | + " - Sessions exceed N turns but must preserve decisions, IDs, and constraints reliably.\n", |
59 | 111 | "<br>"
|
60 | 112 | ]
|
61 | 113 | },
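The last-N trimming strategy described above can be sketched independently of the SDK. The snippet below is an illustrative stand-alone version (the `Session`-based implementation used in this cookbook comes later); the `max_turns` parameter and the message-dict shape are assumptions made for the sketch:

```python
def trim_to_last_n_turns(history, max_turns=5):
    """Keep only the messages belonging to the last `max_turns` user turns.

    `history` is a list of dicts like {"role": "user"|"assistant"|"tool",
    "content": ...}; each new user message marks the start of a turn.
    """
    # Walk backwards, counting user messages until we have `max_turns` of them.
    turns_seen = 0
    for i in range(len(history) - 1, -1, -1):
        if history[i]["role"] == "user":
            turns_seen += 1
            if turns_seen == max_turns:
                return history[i:]  # everything from that user message onward
    return history  # fewer than max_turns turns so far: keep everything

history = [
    {"role": "user", "content": "turn 1"},
    {"role": "assistant", "content": "reply 1"},
    {"role": "user", "content": "turn 2"},
    {"role": "assistant", "content": "reply 2"},
    {"role": "user", "content": "turn 3"},
    {"role": "assistant", "content": "reply 3"},
]
trimmed = trim_to_last_n_turns(history, max_turns=2)
print(len(trimmed))  # 4: the last two user turns and their replies
```

Note the "token spikes" caveat from the cons list: trimming by turns, as here, does not bound tokens, so a single recent turn with a huge tool payload can still overflow the window.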
|
| 114 | + { |
| 115 | + "cell_type": "markdown", |
| 116 | + "id": "3765f2b8", |
| 117 | + "metadata": {}, |
| 118 | + "source": [ |
| 119 | + "**Quick comparison**" |
| 120 | + ] |
| 121 | + }, |
| 122 | + { |
| 123 | + "cell_type": "markdown", |
| 124 | + "id": "940e5bf7", |
| 125 | + "metadata": {}, |
| 126 | + "source": [ |
| 127 | + "| Dimension | **Trimming (last-N turns)** | **Summarizing (older → generated summary)** |\n", |
| 128 | + "| ----------------- | ------------------------------- | ------------------------------------ |\n", |
| 129 | + "| Latency / Cost | Lowest (no extra calls) | Higher at summary refresh points |\n", |
| 130 | + "| Long-range recall | Weak (hard cut-off) | Strong (compact carry-forward) |\n", |
| 131 | + "| Risk type | Context loss | Context distortion/poisoning |\n", |
| 132 | + "| Observability | Simple logs | Must log summary prompts/outputs |\n", |
| 133 | + "| Eval stability | High | Needs robust summary evals |\n", |
| 134 | + "| Best for | Tool-heavy ops, short workflows | Analyst/concierge, long threads |\n" |
| 135 | + ] |
| 136 | + }, |
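To make the summarization path concrete before the full implementation, here is a minimal stand-alone sketch of the idea: once history grows past a threshold, everything but the most recent messages is collapsed into one synthetic assistant message. Here `summarize` is a hypothetical stand-in for a real model call, and `keep_last` is an assumed knob:

```python
def summarize(messages):
    # Stand-in for a model call that would produce a concise summary;
    # this placeholder just records how many messages were compressed.
    return f"[Summary of {len(messages)} earlier messages]"

def compress_history(history, keep_last=4):
    """Replace all but the last `keep_last` messages with a single summary."""
    if len(history) <= keep_last:
        return history
    older, recent = history[:-keep_last], history[-keep_last:]
    summary_msg = {"role": "assistant", "content": summarize(older)}
    return [summary_msg] + recent

history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
compressed = compress_history(history, keep_last=4)
print(len(compressed))  # 5: one summary message plus the last four messages
```

The single summary message is the "searchable anchor" from the pros list; the risk columns of the table (distortion, poisoning) live entirely inside whatever the real `summarize` call produces.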
62 | 137 | {
|
63 | 138 | "cell_type": "markdown",
|
64 | 139 | "id": "fc613968",
|
|
68 | 143 | "\n",
|
69 | 144 | "Before running this cookbook, you must set up the following accounts and complete a few setup actions. These prerequisites are essential to interact with the APIs used in this project.\n",
|
70 | 145 | "\n",
|
71 |
| - "#### Step0: OpenAI Account\n", |
| 146 | + "#### Step 0: OpenAI Account and `OPENAI_API_KEY`\n", |
72 | 147 | "\n",
|
73 | 148 | "- **Purpose:** \n",
|
74 | 149 | " You need an OpenAI account to access language models and use the Agents SDK featured in this cookbook.\n",
|
|
77 | 152 | " [Sign up for an OpenAI account](https://openai.com) if you don’t already have one. Once you have an account, create an API key by visiting the [OpenAI API Keys page](https://platform.openai.com/api-keys)."
|
78 | 153 | ]
|
79 | 154 | },
|
| 155 | + { |
| 156 | + "cell_type": "markdown", |
| 157 | + "id": "094205e7", |
| 158 | + "metadata": {}, |
| 159 | + "source": [ |
| 160 | + "**Before running the workflow, set your environment variables:**\n", |
| 161 | + "\n", |
| 162 | + "```python\n", |
| 163 | + "import os\n", |
| 163 | + "\n", |
| 164 | + "# Your OpenAI API key\n", |
| 164 | + "os.environ[\"OPENAI_API_KEY\"] = \"sk-proj-...\"\n", |
| 165 | + "```\n", |
| 166 | + "\n", |
| 167 | + "Alternatively, you can set the API key for your agents via the `set_default_openai_key` function from the `agents` library:\n", |
| 168 | + "\n", |
| 169 | + "```python\n", |
| 170 | + "from agents import set_default_openai_key\n", |
| 171 | + "set_default_openai_key(\"YOUR_API_KEY\")\n", |
| 172 | + "```" |
| 173 | + ] |
| 174 | + }, |
80 | 175 | {
|
81 | 176 | "cell_type": "markdown",
|
82 | 177 | "id": "3cd9a109",
|
83 | 178 | "metadata": {},
|
84 | 179 | "source": [
|
85 | 180 | "#### Step 1: Install the Required Libraries\n",
|
86 | 181 | "\n",
|
87 |
| - "Below we install the `openai-agents` library (the [OpenAI Agents SDK](https://github.com/openai/openai-agents-python)" |
| 182 | + "Below we install the `openai-agents` library ([OpenAI Agents SDK](https://github.com/openai/openai-agents-python))" |
88 | 183 | ]
|
89 | 184 | },
|
90 | 185 | {
|
|
130 | 225 | },
|
131 | 226 | {
|
132 | 227 | "cell_type": "code",
|
133 |
| - "execution_count": 3, |
| 228 | + "execution_count": null, |
134 | 229 | "id": "fe54469a",
|
135 | 230 | "metadata": {},
|
136 | 231 | "outputs": [
|
|
201 | 296 | "id": "b8074e05",
|
202 | 297 | "metadata": {},
|
203 | 298 | "source": [
|
204 |
| - "## 1. Context Trimming " |
| 299 | + "## Context Trimming" |
205 | 300 | ]
|
206 | 301 | },
|
207 | 302 | {
|
|
577 | 672 | "id": "d6fa349f",
|
578 | 673 | "metadata": {},
|
579 | 674 | "source": [
|
580 |
| - "## 2. Context Summarization " |
| 675 | + "## Context Summarization" |
581 | 676 | ]
|
582 | 677 | },
|
583 | 678 | {
|
|
1150 | 1245 | },
|
1151 | 1246 | {
|
1152 | 1247 | "cell_type": "code",
|
1153 |
| - "execution_count": 248, |
| 1248 | + "execution_count": null, |
1154 | 1249 | "id": "5448ce93",
|
1155 | 1250 | "metadata": {},
|
1156 | 1251 | "outputs": [],
|
1157 | 1252 | "source": [
|
1158 |
| - "full_history = await session.get_items_with_metadata()\n" |
| 1253 | + "full_history = await session.get_items_with_metadata()" |
1159 | 1254 | ]
|
1160 | 1255 | },
|
1161 | 1256 | {
|
|