Update Realtime Prompting Guide and registry.yaml

minh-hoque · minh-hoque · commit 5d3c5bd0e8fa · 2025-08-18T00:16:19.000-04:00
- Modified the Realtime prompting guide to improve the flow of content and enhance clarity.
- Updated the tags in registry.yaml from 'prompt' to 'speech' to better reflect the guide's focus on speech-based interactions.
- Added new instructions regarding the `speed` parameter in the Realtime API to clarify its impact on playback rate.
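
For reviewers, a minimal sketch of the distinction the new `speed` note draws: playback rate is one setting, and speaking style is driven by the prompt. The helper name `build_session_update`, the pacing text, and the placement of `speed` directly under `session` are assumptions made for illustration; the exact payload shape depends on the API version, so check the API reference before relying on it. The code only builds a dict and sends nothing.

```python
import json


def build_session_update(speed: float, pacing_instructions: str) -> dict:
    """Return a session.update-style event as a plain dict (nothing is sent).

    Assumption: `speed` sits directly under `session`; check the API
    reference for the exact location in the API version you use.
    """
    return {
        "type": "session.update",
        "session": {
            # Playback rate only: audio plays faster, wording is unchanged.
            "speed": speed,
            # Speaking style: this is what guides the model toward brisker speech.
            "instructions": pacing_instructions,
        },
    }


# Hypothetical pacing text, written for this sketch.
PACING = "## Pacing\n- Speak at a brisk pace.\n- Keep replies short and avoid filler."

event = build_session_update(1.2, PACING)
print(json.dumps(event, indent=2))
```

Keeping the two settings side by side in one payload makes it easy to test them independently: change `speed` alone to hear the playback effect, then change only the instructions to hear the difference in speaking style.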
diff --git a/examples/Realtime_prompting_guide.ipynb b/examples/Realtime_prompting_guide.ipynb
@@ -20,10 +20,7 @@
     "\n",
     "The new gpt-4o-realtime-08-11 model delivers stronger instruction following, more reliable tool calling, noticeably better voice quality, and an overall smoother feel. These gains make it practical to move from chained approaches to true realtime experiences, cutting latency while keeping a consistent conversational tone.\n",
     "\n",
-    "Realtime model benefit from different prompting techniques that wouldn't directly apply to text based models. You’ll get better results by tuning prompts for speech—pacing, turn-taking, and style and iterating quickly.\n",
-    "\n",
-    "\n",
-    "This prompting guide starts with a simple prompt skeleton, then walks through each part with practical tips, small patterns you can copy, and examples you can adapt to your use case.\n",
+    "Realtime models benefit from different prompting techniques that wouldn't directly apply to text-based models. This prompting guide starts with a simple prompt skeleton, then walks through each part with practical tips, small patterns you can copy, and examples you can adapt to your use case.\n",
     "\n",
     "# Table of Contents\n",
     "\n",
@@ -44,8 +41,8 @@
     "  - [Tool Call Preambles](#tool-call-preambles)\n",
     "    - [Tool Call Preambles + Sample Phrases](#tool-call-preambles-sample-phrases)\n",
     "  - [Tool Calls without Confirmation](#tool-calls-without-confirmation)\n",
-    "  - [Tool Level Behavior](#tool-level-behavior)\n",
     "  - [Tool Call Performance](#tool-call-performance)\n",
+    "  - [Tool Level Behavior](#tool-level-behavior)\n",
     "  - [Rephrase Supervisor Tool (Responder-Thinker Architecture)](#rephrase-supervisor-tool-responder-thinker-architecture)\n",
     "  - [Common Tools](#common-tools)\n",
     "- [Conversation flow](#conversation-flow)\n",
@@ -394,6 +391,7 @@
    "metadata": {},
    "source": [
     "## Speed Instructions\n",
+    "In the Realtime API, the `speed` parameter changes the playback rate of the audio, not how the model composes speech. To make the model itself sound faster, add instructions that guide its pacing.\n",
     "\n",
     "- **When to use**: Users want faster speaking voice; playback speed (with speed parameter) alone doesn’t fix speaking style.\n",
     "- **What it does**: Tunes speaking style (brevity, cadence) independent of client playback speed.\n",
@@ -893,17 +891,20 @@
     "You are a **Prompt-Critique Expert**.\n",
     "Examine a user-supplied LLM prompt and surface any weaknesses following the instructions below.\n",
     "\n",
+    "\n",
     "## Instructions\n",
     "Review the prompt that is meant for an LLM to follow and identify the following issues:\n",
     "- Ambiguity: Could any wording be interpreted in more than one way?\n",
     "- Lacking Definitions: Are there any class labels, terms, or concepts that are not defined that might be misinterpreted by an LLM?\n",
     "- Conflicting, missing, or vague instructions: Are directions incomplete or contradictory?\n",
     "- Unstated assumptions: Does the prompt assume the model has to be able to do something that is not explicitly stated?\n",
     "\n",
+    "\n",
     "## Do **NOT** list issues of the following types:\n",
     "- Invent new instructions, tool calls, or external information. You do not know what tools need to be added that are missing.\n",
     "- Issues that you are not sure about.\n",
     "\n",
+    "\n",
     "## Output Format\n",
     "\"\"\"\n",
     "# Issues\n",
@@ -941,11 +942,13 @@
     "# Instructions/Rules\n",
     "...\n",
     "\n",
+    "\n",
     "## Unclear audio\n",
     "- Always respond in the same language the user is speaking in, if intelligible. (optional)\n",
     "- Only respond to clear speech or text.\n",
     "- If the user's audio is not clear (e.g. background noise/inaudible/silent/unintelligible) or if you did not fully hear or understand the user, ask for clarification using English phrases such as “I didn’t catch that—mind repeating?”. Vary the phrases.'\n",
     "\n",
+    "\n",
     "## Preferred Response Language\n",
     "- Always respond in the same language the user is speaking. This is the preferred language for the session.\n",
     "- If you cannot clearly determine the user's language (e.g., due to background noise, inaudible, silent, unintelligible, or ambiguous input), do not guess or fabricate. Instead, politely ask the user to repeat or clarify.\n",
@@ -987,7 +990,7 @@
    "id": "762c8ced",
    "metadata": {},
    "source": [
-    "Don't mind my intense coughing... however, the model was able to correctly ask for clarifications both times."
+    "In this example, the model asks for clarification after my *(very)* loud cough and unclear audio."
    ]
   },
   {
@@ -1022,6 +1025,7 @@
     "## lookup_account(email_or_phone)\n",
     "...\n",
     "\n",
+    "\n",
     "## check_outage(address)\n",
     "...\n",
     "```\n",
@@ -1181,6 +1185,53 @@
     "*Note: If you notice the model disregarding instructions or constraints about your tool call, there’s a chance this might be due to the fact you are asking it to be very proactive.*"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "dff39254",
+   "metadata": {},
+   "source": [
+    "## Tool Call Performance\n",
+    "As use cases grow more complex and the number of available tools increases, it becomes critical to explicitly guide the model on when to use each tool and, just as importantly, when not to. Clear usage rules not only improve tool call accuracy but also help the model choose the right tool at the right time.\n",
+    "\n",
+    "- **When to use**: Model is struggling with tool call performance and needs the instructions to be explicit to reduce misuse.\n",
+    "- **What it does**: Adds instructions on when to “use/avoid” each tool. You can also add instructions on sequences of tool calls (for example, after Tool call A, you can call Tool call B or C)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "37a8b788",
+   "metadata": {},
+   "source": [
+    "### Example\n",
+    "```\n",
+    "# Tools\n",
+    "- When you call any tools, you must output at the same time a response letting the user know that you are calling the tool.\n",
+    "\n",
+    "## lookup_account(email_or_phone)\n",
+    "Use when: verifying identity or viewing plan/outage flags.\n",
+    "Do NOT use when: the user is clearly anonymous and only asks general questions.\n",
+    "\n",
+    "\n",
+    "## check_outage(address)\n",
+    "Use when: user reports connectivity issues or slow speeds.\n",
+    "Do NOT use when: question is billing-only.\n",
+    "\n",
+    "\n",
+    "## refund_credit(account_id, minutes)\n",
+    "Use when: confirmed outage > 240 minutes in the past 7 days.\n",
+    "Do NOT use when: outage is unconfirmed; route to Diagnose → check_outage first.\n",
+    "\n",
+    "\n",
+    "## schedule_technician(account_id, window)\n",
+    "Use when: repeated failures after reboot and outage status = false.\n",
+    "Do NOT use when: outage status = true (send status + ETA instead).\n",
+    "\n",
+    "\n",
+    "## escalate_to_human(account_id, reason)\n",
+    "Use when: user seems very frustrated, abuse/harassment, repeated failures, billing disputes >$50, or user requests escalation.\n",
+    "```"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "edebafe2",
@@ -1205,75 +1256,37 @@
     "- For the tools marked as CONFIRMATION FIRST: always ask for confirmation to the user.\n",
     "- For the tools marked as PREAMBLES: Before any tool call, say one short line like “I’m checking that now.” Then call the tool immediately.\n",
     "\n",
+    "\n",
     "## lookup_account(email_or_phone) — PROACTIVE\n",
     "Use when: verifying identity or accessing billing.  \n",
     "Rules: If the caller refuses to identify, ask one more time in an angry manner.  \n",
     "Do NOT use when: caller refuses to identify after second request.\n",
     "\n",
+    "\n",
     "## check_outage(address) — PREAMBLES\n",
     "Use when: caller reports failed connection or speed lower than 10 Mbps.  \n",
     "Do NOT use when: purely billing OR when internet speed is above 10 Mbps.  \n",
     "If either condition applies, inform the customer you cannot assist and hang up.\n",
     "\n",
+    "\n",
     "## refund_credit(account_id, minutes) — CONFIRMATION FIRST\n",
     "Use when: confirmed outage > 240 minutes in the past 7 days (credit 60 minutes).  \n",
     "Do NOT use when: outage unconfirmed.  \n",
     "Confirmation phrase: “I can issue a credit for this outage—would you like me to go ahead?”\n",
     "\n",
+    "\n",
     "## schedule_technician(account_id, window) — CONFIRMATION FIRST\n",
     "Use when: reboot + line checks fail AND outage=false.  \n",
     "Windows: “10am–12pm ET” or “2pm–4pm ET”.  \n",
     "Confirmation phrase: “I can schedule a technician to visit—should I book that for you?”\n",
     "\n",
+    "\n",
     "## escalate_to_human(account_id, reason) — PREAMBLES\n",
     "Use when: harassment, threats, self-harm, repeated failure, billing disputes > $50, caller is frustrated, or caller requests escalation.  \n",
     "Preamble: “Let me connect you to a senior agent who can assist further.”\n",
     "```"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "dff39254",
-   "metadata": {},
-   "source": [
-    "## Tool Call Performance\n",
-    "As use cases grow more complex and the number of available tools increases, it becomes critical to explicitly guide the model on when to use each tool and just as importantly, when not to. Clear usage rules not only improve tool call accuracy but also help the model choose the right tool at the right time.\n",
-    "\n",
-    "- **When to use**: Model is struggling with tool call performance and needs the instructions to be explicit to reduce misuse.\n",
-    "- **What it does**: Add instructions on when to “use/avoid” each tool. You can also add instructions on sequences of tool calls (after Tool call A, you can call Tool call B or C)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "37a8b788",
-   "metadata": {},
-   "source": [
-    "### Example\n",
-    "```\n",
-    "# Tools\n",
-    "- When you call any tools, you must output at the same time a response letting the user know that you are calling the tool.\n",
-    "\n",
-    "## lookup_account(email_or_phone)\n",
-    "Use when: verifying identity or viewing plan/outage flags.\n",
-    "Do NOT use when: the user is clearly anonymous and only asks general questions.\n",
-    "\n",
-    "## check_outage(address)\n",
-    "Use when: user reports connectivity issues or slow speeds.\n",
-    "Do NOT use when: question is billing-only.\n",
-    "\n",
-    "## refund_credit(account_id, minutes)\n",
-    "Use when: confirmed outage > 240 minutes in the past 7 days.\n",
-    "Do NOT use when: outage is unconfirmed; route to Diagnose → check_outage first.\n",
-    "\n",
-    "## schedule_technician(account_id, window)\n",
-    "Use when: repeated failures after reboot and outage status = false.\n",
-    "Do NOT use when: outage status = true (send status + ETA instead).\n",
-    "\n",
-    "## escalate_to_human(account_id, reason)\n",
-    "Use when: user seems very frustrated, abuse/harassment, repeated failures, billing disputes >$50, or user requests escalation.\n",
-    "```"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "65d35498",
@@ -1299,11 +1312,13 @@
     "## Supervisor Tool\n",
     "Name: getNextResponseFromSupervisor(relevantContextFromLastUserMessage: string)\n",
     "\n",
+    "\n",
     "When to call:\n",
     "- Any request outside the allow list.\n",
     "- Any factual, policy, account, or process question.\n",
     "- Any action that might require internal lookups or system changes.\n",
     "\n",
+    "\n",
     "When not to call:\n",
     "- Simple greetings and basic chitchat.\n",
     "- Requests to repeat or clarify.\n",
@@ -1312,12 +1327,14 @@
     "  - zip_code for store lookup (findNearestStore)\n",
     "  - topic or keyword for policy lookup (lookupPolicyDocument)\n",
     "\n",
+    "\n",
     "Usage rules and preamble:\n",
     "1) Say a neutral filler phrase to the user, then immediately call the tool. Approved fillers: “One moment.”, “Let me check.”, “Just a second.”, “Give me a moment.”, “Let me see.”, “Let me look into that.” Fillers must not imply success or failure.  \n",
     "2) Do not mention the “Supervisor” when responding with filler phrase.\n",
     "3) relevantContextFromLastUserMessage is a one-line summary of the latest user message; use an empty string if nothing salient.  \n",
     "4) After the tool returns, apply Rephrase Supervisor and send your reply.\n",
     "\n",
+    "\n",
     "### Rephrase Supervisor\n",
     "- Start with a brief conversational opener using active language, then flow into the answer (for example: “Thanks for waiting—”, “Just finished checking that.”, “I’ve got that pulled up now.”).  \n",
     "- Keep it short: no more than 2 sentences.  \n",
@@ -1365,9 +1382,11 @@
     "# answer(question: string)\n",
     "Description: Call this when the customer asks a question that you don't have an answer to or asks to perform an action.\n",
     "\n",
+    "\n",
     "# escalate_to_human()\n",
     "Description: Call this when a customer asks for escalation, or to talk to someone else, or expresses dissatisfaction with the call.\n",
     "\n",
+    "\n",
     "# finish_session()\n",
     "Description: Call this when a customer says they're done with the session or doesn't want to continue. If it's ambiguous, confirm with the customer before calling. \n",
     "```"
@@ -1404,6 +1423,7 @@
     "- Confirm that customer is a Northloop customer\n",
     "Exit to Discovery: Caller states they are a Northloop customer and mentions an initial goal or symptom.\n",
     "\n",
+    "\n",
     "## 2) Discover\n",
     "Goal: Classify the issue and capture minimal details.\n",
     "How to respond:\n",
@@ -1412,13 +1432,15 @@
     "- For billing/account: collect email or phone used on the account.\n",
     "Exit when: Intent and address (for connectivity) or email/phone (for billing) are known.\n",
     "\n",
+    "\n",
     "## 3) Verify\n",
     "Goal: Confirm identity and retrieve the account.\n",
     "How to respond:\n",
     "- Once you have email or phone, call lookup_account(email_or_phone).\n",
     "- If lookup fails, try the alternate identifier once; otherwise proceed with general guidance or offer escalation if account actions are required.\n",
     "Exit when: Account ID is returned.\n",
     "\n",
+    "\n",
     "## 4) Diagnose\n",
     "Goal: Decide outage vs local issue.\n",
     "How to respond:\n",
@@ -1427,6 +1449,7 @@
     "- If outage=false, guide a short reboot/cabling check; confirm each step’s result before continuing.\n",
     "Exit when: Root cause known.\n",
     "\n",
+    "\n",
     "## 5) Resolve\n",
     "Goal: Apply fix, credit, or appointment.\n",
     "How to respond:\n",
@@ -1435,6 +1458,7 @@
     "- If the local fix worked, state the result and next steps briefly.\n",
     "Exit when: A fix/credit/appointment has been applied and acknowledged by the caller.\n",
     "\n",
+    "\n",
     "## 6) Confirm/Close\n",
     "Goal: Confirm outcome and end cleanly.\n",
     "How to respond:\n",
@@ -1510,6 +1534,7 @@
     "- “Hi there—tell me what you’d like help with.”\n",
     "Exit when: Caller states an initial goal or symptom.\n",
     "\n",
+    "\n",
     "## 2) Discover\n",
     "Goal: Classify the issue and capture minimal details.\n",
     "How to respond:\n",
@@ -1522,6 +1547,7 @@
     "- “What’s the email or phone number on the account?”\n",
     "Exit when: Intent and address (for connectivity) or email/phone (for billing) are known.\n",
     "\n",
+    "\n",
     "## 3) Verify\n",
     "Goal: Confirm identity and retrieve the account.\n",
     "How to respond:\n",
@@ -1533,6 +1559,7 @@
     "- “Found your account. I’ll take care of this.”\n",
     "Exit when: Account ID is returned.\n",
     "\n",
+    "\n",
     "## 4) Diagnose\n",
     "Goal: Decide outage vs local issue.\n",
     "How to respond:\n",
@@ -1545,6 +1572,7 @@
     "- “Please confirm the modem lights: is the internet light solid or blinking?”\n",
     "Exit when: Root cause known.\n",
     "\n",
+    "\n",
     "## 5) Resolve\n",
     "Goal: Apply fix, credit, or appointment.\n",
     "How to respond:\n",
@@ -1557,6 +1585,7 @@
     "- “Credit applied—you’ll see it on your next bill.”\n",
     "Exit when: A fix/credit/appointment has been applied and acknowledged by the caller.\n",
     "\n",
+    "\n",
     "## 6) Confirm/Close\n",
     "Goal: Confirm outcome and end cleanly.\n",
     "How to respond:\n",
@@ -1802,11 +1831,14 @@
     "- **2** failed tool attempts on the same task **or** **3** consecutive no-match/no-input events\n",
     "- Out-of-scope or restricted (e.g., real-time news, financial/legal/medical advice)\n",
     "\n",
+    "\n",
     "**Examples of what to say (Mandatory phrase before handoff):**\n",
     "- “I'm sorry for the trouble — I'm transferring you to a specialist now. **.”\n",
     "\n",
+    "\n",
     "**Then call the tool:** `escalate_to_human`\n",
     "\n",
+    "\n",
     "Examples that would require escalation:\n",
     "- “This is the third time the reset didn’t work. Just get me a person.”\n",
     "- “I am extremely frustrated!”\n",
diff --git a/registry.yaml b/registry.yaml
@@ -11,7 +11,7 @@
     - minh-hoque
   tags:
     - realtime
-    - prompt
+    - speech
     - audio
     - responses