|
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | 7 | "# Managing Function Calls With Reasoning Models\n", |
8 | | - "OpenAI now offers [reasoning models](https://platform.openai.com/docs/guides/reasoning?api-mode=responses) which are trained to follow logical chains of thought, making them better suited for complex or multi-step tasks.\n", |
9 | | - "> \"_Reasoning models like o3 and o4-mini are LLMs trained with reinforcement learning to perform reasoning. Reasoning models think before they answer, producing a long internal chain of thought before responding to the user. Reasoning models excel in complex problem solving, coding, scientific reasoning, and multi-step planning for agentic workflows. They're also the best models for Codex CLI, our lightweight coding agent._\"\n", |
| 8 | + "OpenAI now offers function calling using [reasoning models](https://platform.openai.com/docs/guides/reasoning?api-mode=responses). Reasoning models are trained to follow logical chains of thought, making them better suited for complex or multi-step tasks.\n", |
| 9 | + "> _Reasoning models like o3 and o4-mini are LLMs trained with reinforcement learning to perform reasoning. Reasoning models think before they answer, producing a long internal chain of thought before responding to the user. Reasoning models excel in complex problem solving, coding, scientific reasoning, and multi-step planning for agentic workflows. They're also the best models for Codex CLI, our lightweight coding agent._\n", |
10 | 10 | "\n", |
11 | 11 | "For the most part, using these models via the API is very simple and comparable to using familiar classic 'chat' models. \n", |
12 | 12 | "\n", |
|
30 | 30 | "source": [ |
31 | 31 | "# pip install openai\n", |
32 | 32 | "# Import libraries \n", |
33 | | - "import json, openai\n", |
| 33 | + "import json\n", |
| 34 | + "from openai import OpenAI\n", |
34 | 35 | "from uuid import uuid4\n", |
35 | 36 | "from typing import Callable\n", |
36 | 37 | "\n", |
37 | | - "client = openai.OpenAI()\n", |
| 38 | + "client = OpenAI()\n", |
38 | 39 | "MODEL_DEFAULTS = {\n", |
39 | 40 | " \"model\": \"o4-mini\", # 200,000 token context window\n", |
40 | 41 | " \"reasoning\": {\"effort\": \"low\", \"summary\": \"auto\"}, # Automatically summarise the reasoning process. Can also choose \"detailed\" or \"none\"\n", |
|
65 | 66 | } |
66 | 67 | ], |
67 | 68 | "source": [ |
68 | | - "# Let's keep track of the response ids in a naive way, in case we want to reverse the conversation and pick up from a previous point\n", |
69 | | - "response = client.responses.create(input=\"Which of the last four Olympic host cities has the highest average temperature?\", **MODEL_DEFAULTS)\n", |
| 69 | + "response = client.responses.create(\n", |
| 70 | + " input=\"Which of the last four Olympic host cities has the highest average temperature?\",\n", |
| 71 | + " **MODEL_DEFAULTS\n", |
| 72 | + ")\n", |
70 | 73 | "print(response.output_text)\n", |
71 | | - "response = client.responses.create(input=\"what about the lowest?\", previous_response_id=response.id, **MODEL_DEFAULTS)\n", |
| 74 | + "\n", |
| 75 | + "response = client.responses.create(\n", |
| 76 | + " input=\"what about the lowest?\",\n", |
| 77 | + " previous_response_id=response.id,\n", |
| 78 | + " **MODEL_DEFAULTS\n", |
| 79 | + ")\n", |
72 | 80 | "print(response.output_text)" |
73 | 81 | ] |
74 | 82 | }, |
|
397 | 405 | "## Manual conversation orchestration\n", |
398 | 406 | "So far so good! It's really cool to watch the model pause execution to run a function before continuing. \n", |
399 | 407 | "In practice the example above is quite trivial, and production use cases may be much more complex:\n", |
400 | | - "* Our context window may grow too large and we may wish to prune older and less relevant messages\n", |
401 | | - "* We may not wish to proceed sequentially using the `previous_response_id` but allow users to navigate back and forth through the conversation and re-generate answers\n", |
| 408 | + "* Our context window may grow too large and we may wish to prune older and less relevant messages, or summarize the conversation so far\n", |
| 409 | + "* We may wish to allow users to navigate back and forth through the conversation and re-generate answers\n", |
402 | 410 | "* We may wish to store messages in our own database for audit purposes rather than relying on OpenAI's storage and orchestration\n", |
403 | 411 | "* etc.\n", |
404 | 412 | "\n", |
|
526 | 534 | "metadata": {}, |
527 | 535 | "source": [ |
528 | 536 | "## Summary\n", |
529 | | - "* Reasoning models can invoke custom functions during their reasoning process, allowing for complex workflows that require external data or operations.\n", |
530 | | - "* These models may require multiple function calls in series, as some steps depend on the results of previous ones, necessitating a loop to handle ongoing reasoning.\n", |
531 | | - "* It's essential to preserve reasoning and function call responses in the conversation history to maintain the chain-of-thought and avoid errors in the reasoning process.\n" |
| 537 | + "In this cookbook, we identified how to combine function calling with OpenAI's reasoning models to demonstrate multi-step tasks that are dependent on external data sources. \n", |
| 538 | + "\n", |
| 539 | + "Importantly, we covered reasoning-model specific nuances in the function calling process, specifically that:\n", |
| 540 | + "* The model may choose to make multiple function calls or reasoning steps in series, and some steps may depend on the results of previous ones\n", |
| 541 | + "* We cannot know how many of these steps there will be, so we must process responses with a loop\n", |
| 542 | + "* The responses API makes orchestration easy using the `previous_response_id` parameter, but where manual control is needed, it's important to maintain the correct order of conversation item to preserve the 'chain-of-thought'\n", |
| 543 | + "\n", |
| 544 | + "---\n", |
| 545 | + "\n", |
| 546 | + "The examples used here are rather simple, but you can imagine how this technique could be extended to more real-world use cases, such as:\n", |
| 547 | + "\n", |
| 548 | + "* Looking up a customer's transaction history and recent correspondence to determine if they are eligible for a promotional offer\n", |
| 549 | + "* Calling recent transaction logs, geolocation data, and device metadata to assess the likelihood of a transaction being fraudulent\n", |
| 550 | + "* Reviewing internal HR databases to fetch an employee’s benefits usage, tenure, and recent policy changes to answer personalized HR questions\n", |
| 551 | + "* Reading internal dashboards, competitor news feeds, and market analyses to compile a daily executive briefing tailored to their focus areas" |
532 | 552 | ] |
533 | | - }, |
534 | | - { |
535 | | - "cell_type": "markdown", |
536 | | - "metadata": {}, |
537 | | - "source": [] |
538 | 553 | } |
539 | 554 | ], |
540 | 555 | "metadata": { |
|