You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-services/openai/how-to/prompt-caching.md
+5-5Lines changed: 5 additions & 5 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -75,16 +75,16 @@ A single character difference in the first 1,024 tokens will result in a cache m
75
75
76
76
## What is cached?
77
77
78
-
The o1-series models are text only and don't support system messages, images, tool use/function calling, or structured outputs. This limits the efficacy of prompt caching for these models to the user/assistant portions of the messages array which are less likely to have an identical 1024 token prefix.
78
+
o1-series models feature support varies by model. For more details, see our dedicated [reasoning models guide](./reasoning.md).
|**Messages**| The complete messages array: system, user, and assistant content |`gpt-4o`<br/>`gpt-4o-mini`<br/>`gpt-4o-realtime-preview` (version 2024-12-17) |
85
-
|**Images**| Images included in user messages, both as links or as base64-encoded data. The detail parameter must be set the same across requests. |`gpt-4o`<br/>`gpt-4o-mini`|
86
-
|**Tool use**| Both the messages array and tool definitions. |`gpt-4o`<br/>`gpt-4o-mini`<br/>`gpt-4o-realtime-preview` (version 2024-12-17) |
87
-
|**Structured outputs**| Structured output schema is appended as a prefix to the system message. |`gpt-4o`<br/>`gpt-4o-mini`|
84
+
|**Messages**| The complete messages array: system, developer, user, and assistant content |`gpt-4o`<br/>`gpt-4o-mini`<br/>`gpt-4o-realtime-preview` (version 2024-12-17) <br> `o1` (version 2024-12-17) |
85
+
|**Images**| Images included in user messages, both as links or as base64-encoded data. The detail parameter must be set the same across requests. |`gpt-4o`<br/>`gpt-4o-mini`<br> `o1` (version 2024-12-17) |
86
+
|**Tool use**| Both the messages array and tool definitions. |`gpt-4o`<br/>`gpt-4o-mini`<br/>`gpt-4o-realtime-preview` (version 2024-12-17) <br> `o1` (version 2024-12-17) |
87
+
|**Structured outputs**| Structured output schema is appended as a prefix to the system message. |`gpt-4o`<br/>`gpt-4o-mini`<br> `o1` (version 2024-12-17) |
88
88
89
89
To improve the likelihood of cache hits occurring, you should structure your requests such that repetitive content occurs at the beginning of the messages array.
0 commit comments