|
20 | 20 | "\n",
|
21 | 21 | "The new gpt-realtime model delivers stronger instruction following, more reliable tool calling, noticeably better voice quality, and an overall smoother feel. These gains make it practical to move from chained approaches to true realtime experiences, cutting latency and producing responses that sound more natural and expressive.\n",
|
22 | 22 | "\n",
|
23 |
| - "Realtime model benefit from different prompting techniques that wouldn't directly apply to text based models. This prompting guide starts with a suggested prompt skeleton, then walks through each part with practical tips, small patterns you can copy, and examples you can adapt to your use case.\n", |
| 23 | + "Realtime models benefit from different prompting techniques that wouldn't directly apply to text-based models. This prompting guide starts with a suggested prompt skeleton, then walks through each part with practical tips, small patterns you can copy, and examples you can adapt to your use case.\n", |
24 | 24 | "\n",
|
25 | 25 | "# Table of Contents\n",
|
26 | 26 | "\n",
|
|
33 | 33 | " - [Language Constraint](#language-constraint)\n",
|
34 | 34 | " - [Reduce Repetition](#reduce-repetition)\n",
|
35 | 35 | "- [Reference Pronunciations](#reference-pronunciations)\n",
|
36 |
| - " - [Alpha-numeric Pronunciations](#alpha-numeric-pronunciations)\n", |
| 36 | + " - [Alphanumeric Pronunciations](#alphanumeric-pronunciations)\n", |
37 | 37 | "- [Instructions](#instructions)\n",
|
38 | 38 | " - [Instruction Following](#instruction-following)\n",
|
39 | 39 | " - [No audio or unclear audio](#no-audio-or-unclear-audio)\n",
|
|
75 | 75 | "- **Iterate relentlessly**: Small wording changes can make or break behavior.\n",
|
76 | 76 | " - Example: For unclear audio instruction, we swapped “inaudible” → “unintelligible” which improved noisy input handling.\n",
|
77 | 77 | "- **Prefer bullets over paragraphs**: Clear, short bullets outperform long paragraphs.\n",
|
78 |
| - "- **Guide with examples**: The model strongly follows onto sample phrases.\n", |
| 78 | + "- **Guide with examples**: The model closely follows sample phrases.\n", |
79 | 79 | "- **Be precise**: Ambiguity or conflicting instructions = degraded performance similar to GPT-5.\n",
|
80 | 80 | "- **Control language**: Pin output to a target language if you see unwanted language switching.\n",
|
81 | 81 | "- **Reduce repetition**: Add a Variety rule to reduce robotic phrasing.\n",
|
|
795 | 795 | "id": "3ac77666",
|
796 | 796 | "metadata": {},
|
797 | 797 | "source": [
|
798 |
| - "## Alpha-numeric Pronunciations\n", |
| 798 | + "## Alphanumeric Pronunciations\n", |
799 | 799 | "Realtime S2S can blur or merge digits/letters when reading back key info (phone, credit card, order IDs). Explicit character-by-character confirmation prevents mishearing and drives clearer synthesis.\n",
|
800 | 800 | "\n",
|
801 | 801 | "- **When to use**: If the model struggles to capture or read back phone numbers, card numbers, 2FA codes, order IDs, serials, addresses/unit numbers, or mixed alphanumeric strings.\n",
|
|
915 | 915 | "\n",
|
916 | 916 | "## Do **NOT** list issues of the following types:\n",
|
917 | 917 | "- Invent new instructions, tool calls, or external information. You do not know what tools need to be added that are missing.\n",
|
918 |
| - "- Issues that you are not sure about.\n", |
| 918 | + "- Issues that you are unsure about.\n", |
919 | 919 | "\n",
|
920 | 920 | "\n",
|
921 | 921 | "## Output Format\n",
|
|
924 | 924 | "- Numbered list; include brief quote snippets.\n",
|
925 | 925 | "\n",
|
926 | 926 | "# Improvements\n",
|
927 |
| - "- Numbered list; provide the revised lines you would change and how you would changed them.\n", |
| 927 | + "- Numbered list; provide the revised lines you would change and how you would change them.\n", |
928 | 928 | "\n",
|
929 | 929 | "# Revised Prompt\n",
|
930 | 930 | "- Revised prompt where you have applied all your improvements surgically with minimal edits to the original prompt\n",
|
|
960 | 960 | "id": "e9d05945",
|
961 | 961 | "metadata": {},
|
962 | 962 | "source": [
|
963 |
| - "## No audio or unclear audio\n", |
| 963 | + "## No Audio or Unclear Audio\n", |
964 | 964 | "Sometimes the model thinks it hears something and tries to respond. You can add a custom instruction telling the model how to behave when it hears unclear audio or user input. Modify the desired behavior to fit your use case (for example, you may want the model to repeat the same question rather than ask for clarification).\n",
|
965 | 965 | "\n",
|
966 | 966 | "- **When to use**: Background noise, partial words, or silence trigger unwanted replies.\n",
|
|
1175 | 1175 | "id": "f46ed3c0",
|
1176 | 1176 | "metadata": {},
|
1177 | 1177 | "source": [
|
1178 |
| - "## Tool Calls without Confirmation\n", |
| 1178 | + "## Tool Calls Without Confirmation\n", |
1179 | 1179 | "Sometimes the model might ask for confirmation before a tool call. For some use cases, this can lead to a poor experience for the end user since the model is not being proactive.\n",
|
1180 | 1180 | "\n",
|
1181 | 1181 | "- **When to use**: The agent asks for permission before obvious tool calls.\n",
|
|
1436 | 1436 | "id": "1d4ef67e",
|
1437 | 1437 | "metadata": {},
|
1438 | 1438 | "source": [
|
1439 |
| - "# Conversation flow\n", |
| 1439 | + "# Conversation Flow\n", |
1440 | 1440 | "This section covers how to structure the dialogue into clear, goal-driven phases so the model knows exactly what to do at each step. It defines the purpose of each phase, the instructions for moving through it, and the concrete “exit criteria” for transitioning to the next. This prevents the model from stalling, skipping steps, or jumping ahead, and ensures the conversation stays organized from greeting to resolution.\n",
|
1441 | 1441 | "\n",
|
1442 | 1442 | "Organizing your prompt into conversation states also makes it easier to identify error modes and iterate more effectively.\n",
|
|
1512 | 1512 | "id": "0ccf1d53",
|
1513 | 1513 | "metadata": {},
|
1514 | 1514 | "source": [
|
1515 |
| - "## Sample phrases\n", |
| 1515 | + "## Sample Phrases\n", |
1516 | 1516 | "Sample phrases act as “anchor examples” for the model. They show the style, brevity, and tone you want it to follow, without locking it into one rigid response.\n",
|
1517 | 1517 | "\n",
|
1518 | 1518 | "- **When to use**: Responses lack your brand style or are not consistent.\n",
|
|