Skip to content

Commit 7d55dce

Browse files
corrected typos and formatting
1 parent 1f3221e commit 7d55dce

File tree

1 file changed

+10
-7
lines changed

1 file changed

+10
-7
lines changed

examples/o-series/o3o4-mini_prompting_guide.ipynb

Lines changed: 10 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -56,7 +56,7 @@
5656
"\n",
5757
"As a retail agent, you can help users cancel or modify pending orders, return or exchange delivered orders, modify their default user address, or provide information about their own profile, orders, and related products.\n",
5858
"```\n",
59-
"2.Function Call ordering: o3/o4-mini are trained to accomplish goals with tools. However, it can make mistakes in the order of the tool calls. To guard against these cases, it is recommended to explicitly outline the orders to accomplish certain tasks. For example, to guard against the failure case that a coding agent possibly making a file in a directory that does not yet exist, adding the following will usually suffice:\n",
59+
"2. Function Call ordering: o3/o4-mini are trained to accomplish goals with tools. However, it can make mistakes in the order of the tool calls. To guard against these cases, it is recommended to explicitly outline the orders to accomplish certain tasks. For example, to guard against the failure case that a coding agent possibly making a file in a directory that does not yet exist, adding the following will usually suffice:\n",
6060
"```\n",
6161
"check to see if directories exist before making files\n",
6262
"```\n",
@@ -286,17 +286,20 @@
286286
"\n",
287287
"**A:** For o3 and o4-mini models, there is no hard upper limit on the number of functions, but practical guidance does exist based on both training data distribution and observed model behavior. As of May 2025, any setup with fewer than ~100 tools and fewer than ~20 arguments per tool is considered in-distribution and should perform within expected reliability bounds. Performance still depends on your prompt design and task complexity. \n",
288288
"\n",
289-
"Even if you’re technically within training distribution, more tools can introduce ambiguity or confusion. Here are key considerations:\n",
290-
"Function description clarity becomes critical: If multiple tools have overlapping purposes or vague descriptions, models may call the wrong one or hesitate to call any at all .\n",
291-
"Tool list size can affect latency and reasoning depth: Longer lists mean the model has more options to parse during its reasoning phase. While o3/o4-mini can handle this with their integrated reasoning pipelines, performance can degrade if schema clarity or invocation conditions aren’t sharp.\n",
292-
"Tool hallucinations can increase with complexity: Especially with o3, there have been reports of hallucinated or speculative tool calls when the toolset is large and under-defined. Explicit instructions help mitigate this (e.g., “Only use tools X, Y, Z. Do not invent tool calls or defer them to future turns.”)\n",
289+
"Even if you are technically within training distribution, more tools can introduce ambiguity or confusion. Here are key considerations:\n",
290+
"\n",
291+
"* Function description clarity becomes critical: If multiple tools have overlapping purposes or vague descriptions, models may call the wrong one or hesitate to call any at all.\n",
292+
"\n",
293+
"* Tool list size can affect latency and reasoning depth: Longer lists mean the model has more options to parse during its reasoning phase. While o3/o4-mini can handle this with their integrated reasoning pipelines, performance can degrade if schema clarity or invocation conditions aren’t sharp.\n",
294+
"\n",
295+
"* Tool hallucinations can increase with complexity: Especially with o3, there have been reports of hallucinated or speculative tool calls when the toolset is large and under-defined. Explicit instructions help mitigate this (e.g., “Only use tools X, Y, Z. Do not invent tool calls or defer them to future turns.”)\n",
293296
"\n",
294297
"Ultimately, the performance will defer depending on the use case; Therefore it is important to invest in evals that you trust you can use to iterate on.\n",
295298
"\n",
296299
"\n",
297300
"**Q: Is it OK to have deeply nested params within tools or should I \"flatten\" out the schema?**\n",
298301
"\n",
299-
"**A:** There is again no hard guidance. However, even if your nesting structure is technically supported, deeply layered argument trees can impact performance or reliability. When in doubt we recommend you err on the side of making the arguments flat:\n",
302+
"**A:** There is again no hard guidance. However, even if your nesting structure is technically supported, deeply layered argument trees can impact performance or reliability. When in doubt we recommend you err on the side of making the arguments flat.\n",
300303
"\n",
301304
"Flat structures are often easier for the model to reason about: In flatter schemas, argument fields are top-level and immediately visible. This reduces the need for internal parsing and structuring, which can help prevent issues like partially filled nested objects or invalid field combinations. With deeply nested objects, especially ones with repeated or semantically similar field names, the model is more likely to omit or misuse arguments.\n",
302305
"\n",
@@ -307,7 +310,7 @@
307310
"\n",
308311
"**Q: Does this function-calling guidance apply to custom tool formats?**\n",
309312
"\n",
310-
"**A:** Not guaranteed. The guidance in this document assumes you’re using the standard tools model parameter to pass your function schemas. Our o3/o4-mini models are trained to understand and use these schemas natively for tool selection and argument construction.\n",
313+
"**A:** Not guaranteed. The guidance in this document assumes you’re using the standard `tools` model parameter to pass your function schemas, as shown in our [general guide](https://platform.openai.com/docs/guides/function-calling) on function calling. Our o3/o4-mini models are trained to understand and use these schemas natively for tool selection and argument construction.\n",
311314
"\n",
312315
"If you’re instead providing custom tool definitions via natural language in a developer-authored prompt (e.g., defining tools inline in the developer message or user message), this guidance may not fully apply. In those cases:\n",
313316
"The model is not relying on its internal tool-schema priors\n",

0 commit comments

Comments
 (0)