|
| 1 | +This document explains how to handle a scenario where a user is on hold while the system attempts to connect them to a specialist. If the specialist does not pick up within X seconds or if the call hits voicemail, we take an alternate action (like playing an announcement or scheduling an appointment). This solution integrates Vapi.ai for AI-driven conversations and Twilio for call bridging. |
| 2 | + |
| 3 | +## Problem |
| 4 | + |
| 5 | +Vapi.ai does not provide a built-in way to keep the user on hold, dial a specialist, and handle cases where the specialist is unavailable. We want: |
| 6 | + |
| 7 | +1. The user is already talking to the AI (Vapi). |
| 8 | +2. The AI offers to connect them to a specialist. |
| 9 | +3. The user is placed on hold or in a conference room. |
| 10 | +4. We dial the specialist to join. |
| 11 | +5. If the specialist answers, everyone is merged. |
| 12 | +6. If the specialist does not answer within X seconds, or the call goes to voicemail, we either announce "Specialist not available" or schedule an appointment. |
| 13 | + |
| 14 | +## Solution |
| 15 | + |
| 16 | +1. An inbound call arrives from Vapi or from the user directly. |
| 17 | +2. We store its details (e.g., Twilio CallSid). |
| 18 | +3. We send TwiML (or instructions) to put the user in a Twilio conference (on hold). |
| 19 | +4. We place a second call to the specialist, also directed to join the same conference. |
| 20 | +5. If the specialist picks up, Twilio merges the calls. |
| 21 | +6. If not, we handle the no-answer event by playing a message or returning control to the AI for scheduling. |
| 22 | + |
| 23 | +## Steps to Solve the Problem |
| 24 | + |
| 25 | +1. **Receive Inbound Call** |
| 26 | + |
| 27 | + - Twilio posts data to your `/inbound_call`. |
| 28 | + - You store the call reference. |
| 29 | + - You might also invoke Vapi for initial AI instructions. |
| 30 | + |
| 31 | +2. **Prompt User via Vapi** |
| 32 | + |
| 33 | + - The user decides whether they want the specialist. |
| 34 | + - If yes, you call an endpoint (e.g., `/connect`). |
| 35 | + |
| 36 | +3. **Create/Join Conference** |
| 37 | + |
| 38 | + - In `/connect`, you update the inbound call to go into a conference route. |
| 39 | + - The user is effectively on hold. |
| 40 | + |
| 41 | +4. **Dial Specialist** |
| 42 | + |
| 43 | + - You create a second call leg to the specialist’s phone. |
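| 44 | + - A `statusCallback` can detect no-answer, busy, or failed calls. Voicemail is reported as answered, so it needs separate detection (see Notes & Limitations). |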
| 45 | + |
| 46 | +5. **Detect Unanswered** |
| 47 | + |
| 48 | + - If Twilio sees a no-answer or failure, your callback logic plays an announcement or signals the AI to schedule an appointment. |
| 49 | + |
| 50 | +6. **Merge or Exit** |
| 51 | + |
| 52 | + - If the specialist answers, they join the user. |
| 53 | + - If not, the user is taken off hold and the call ends or goes back to AI. |
| 54 | + |
| 55 | +7. **Use Ephemeral Call (Optional)** |
| 56 | + - If you need an in-conference announcement, create a short-lived Twilio call that uses `<Say>` to play the message to everyone, then ends the conference. |
| 57 | + |
| 58 | +## Code Example |
| 59 | + |
| 60 | +Below is a minimal Express.js server that implements the on-hold specialist transfer with Vapi and Twilio. |
| 61 | + |
| 62 | +1. **Express Setup and Environment** |
| 63 | + |
| 64 | +```js |
| 65 | +const express = require("express"); |
| 66 | +const bodyParser = require("body-parser"); |
| 67 | +const axios = require("axios"); |
| 68 | +const twilio = require("twilio"); |
| 69 | + |
| 70 | +const app = express(); |
| 71 | +app.use(bodyParser.urlencoded({ extended: true })); |
| 72 | +app.use(bodyParser.json()); |
| 73 | + |
| 74 | +// Load important env vars |
| 75 | +const { |
| 76 | + TWILIO_ACCOUNT_SID, |
| 77 | + TWILIO_AUTH_TOKEN, |
| 78 | + FROM_NUMBER, |
| 79 | + TO_NUMBER, |
| 80 | + VAPI_BASE_URL, |
| 81 | + PHONE_NUMBER_ID, |
| 82 | + ASSISTANT_ID, |
| 83 | + PRIVATE_API_KEY, |
| 84 | +} = process.env; |
| 85 | + |
| 86 | +// Create a Twilio client |
| 87 | +const client = twilio(TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN); |
| 88 | + |
| 89 | +// We'll store the inbound call SID here for simplicity |
| 90 | +let globalCallSid = ""; |
| 91 | +``` |
| 92 | + |
| 93 | +2. **`/inbound_call` - Handling the Inbound Call** |
| 94 | + |
| 95 | +```js |
| 96 | +app.post("/inbound_call", async (req, res) => { |
| 97 | + try { |
| 98 | + globalCallSid = req.body.CallSid; |
| 99 | + const caller = req.body.Caller; |
| 100 | + |
| 101 | + // Example: We call Vapi.ai to get initial TwiML |
| 102 | + const response = await axios.post( |
| 103 | + `${VAPI_BASE_URL || "https://api.vapi.ai"}/call`, |
| 104 | + { |
| 105 | + phoneNumberId: PHONE_NUMBER_ID, |
| 106 | + phoneCallProviderBypassEnabled: true, |
| 107 | + customer: { number: caller }, |
| 108 | + assistantId: ASSISTANT_ID, |
| 109 | + }, |
| 110 | + { |
| 111 | + headers: { |
| 112 | + Authorization: `Bearer ${PRIVATE_API_KEY}`, |
| 113 | + "Content-Type": "application/json", |
| 114 | + }, |
| 115 | + } |
| 116 | + ); |
| 117 | + |
| 118 | + const returnedTwiml = response.data.phoneCallProviderDetails.twiml; |
| 119 | + return res.type("text/xml").send(returnedTwiml); |
| 120 | + } catch (err) { |
| 121 | + return res.status(500).send("Internal Server Error"); |
| 122 | + } |
| 123 | +}); |
| 124 | +``` |
| 125 | + |
| 126 | +3. **`/connect` - Putting User on Hold and Dialing Specialist** |
| 127 | + |
| 128 | +```js |
| 129 | +app.post("/connect", async (req, res) => { |
| 130 | + try { |
| 131 | + const protocol = |
| 132 | + req.headers["x-forwarded-proto"] === "https" ? "https" : "http"; |
| 133 | + const baseUrl = `${protocol}://${req.get("host")}`; |
| 134 | + const conferenceUrl = `${baseUrl}/conference`; |
| 135 | + |
| 136 | + // 1) Update inbound call to fetch TwiML from /conference |
| 137 | + await client.calls(globalCallSid).update({ |
| 138 | + url: conferenceUrl, |
| 139 | + method: "POST", |
| 140 | + }); |
| 141 | + |
| 142 | + // 2) Dial the specialist |
| 143 | + const statusCallbackUrl = `${baseUrl}/participant-status`; |
| 144 | + |
| 145 | + await client.calls.create({ |
| 146 | + to: TO_NUMBER, |
| 147 | + from: FROM_NUMBER, |
| 148 | + url: conferenceUrl, |
| 149 | + method: "POST", |
| 150 | + statusCallback: statusCallbackUrl, |
| 151 | + statusCallbackMethod: "POST", |
| 152 | + }); |
| 153 | + |
| 154 | + return res.json({ status: "Specialist call initiated" }); |
| 155 | + } catch (err) { |
| 156 | + return res.status(500).json({ error: "Failed to connect specialist" }); |
| 157 | + } |
| 158 | +}); |
| 159 | +``` |
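The "X seconds" limit and the voicemail case can both be expressed as options on the outbound specialist call. The sketch below is one possible way to build those options, not the original server's code: `buildSpecialistCallOptions`, `ringSeconds`, and the example URL are hypothetical names chosen for illustration. It assumes Twilio's `timeout` (ring time in seconds) and `machineDetection` call parameters.

```javascript
// Hypothetical helper: builds the options object for client.calls.create().
// `timeout` and `machineDetection` are Twilio call parameters; the helper name,
// its arguments, and the 20-second default are assumptions for this sketch.
function buildSpecialistCallOptions({ to, from, baseUrl, ringSeconds = 20 }) {
  return {
    to,
    from,
    url: `${baseUrl}/conference`,
    method: "POST",
    statusCallback: `${baseUrl}/participant-status`,
    statusCallbackMethod: "POST",
    // Stop ringing after this many seconds; Twilio then reports "no-answer".
    timeout: ringSeconds,
    // Ask Twilio to guess whether a human or a machine answered.
    machineDetection: "Enable",
  };
}

// Example with placeholder values (not real numbers or hosts).
const exampleOptions = buildSpecialistCallOptions({
  to: "+15550001111",
  from: "+15552223333",
  baseUrl: "https://example.test",
});
console.log(exampleOptions.timeout, exampleOptions.statusCallback);
```

In `/connect`, the object returned by this helper could be passed to `client.calls.create(...)` in place of the inline options. Keeping it a pure function makes the ring timeout easy to adjust and to test without placing a real call.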
| 160 | + |
| 161 | +4. **`/conference` - Placing Callers Into a Conference** |
| 162 | + |
| 163 | +```js |
| 164 | +app.post("/conference", (req, res) => { |
| 165 | + const VoiceResponse = twilio.twiml.VoiceResponse; |
| 166 | + const twiml = new VoiceResponse(); |
| 167 | + |
| 168 | + // Put the caller(s) into a conference |
| 169 | + const dial = twiml.dial(); |
| 170 | + dial.conference( |
| 171 | + { |
| 172 | + startConferenceOnEnter: true, |
| 173 | + endConferenceOnExit: true, |
| 174 | + }, |
| 175 | + "my_conference_room" |
| 176 | + ); |
| 177 | + |
| 178 | + return res.type("text/xml").send(twiml.toString()); |
| 179 | +}); |
| 180 | +``` |
| 181 | + |
| 182 | +5. **`/participant-status` - Handling No-Answer or Busy** |
| 183 | + |
| 184 | +```js |
| 185 | +app.post("/participant-status", async (req, res) => { |
| 186 | + const callStatus = req.body.CallStatus; |
| 187 | + if (["no-answer", "busy", "failed"].includes(callStatus)) { |
| 188 | + console.log("Specialist did not pick up:", callStatus); |
| 189 | + // Additional logic: schedule an appointment, ephemeral call, etc. |
| 190 | + } |
| 191 | + return res.sendStatus(200); |
| 192 | +}); |
| 193 | +``` |
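The "additional logic" placeholder in the callback above could start from a small decision function. This is a minimal sketch under stated assumptions: `classifySpecialistOutcome` and its return labels are hypothetical, and it assumes the callback body may carry an `AnsweredBy` field (sent by Twilio when answering machine detection is enabled on the call) alongside `CallStatus`.

```javascript
// Statuses that mean the specialist leg never connected.
const UNANSWERED_STATUSES = new Set(["no-answer", "busy", "failed", "canceled"]);

// Hypothetical helper: maps a Twilio callback body to one of
// "unavailable", "voicemail", "answered", or "pending".
function classifySpecialistOutcome(body) {
  const status = body.CallStatus;
  // AnsweredBy is only present when machine detection is enabled (assumption).
  const answeredBy = body.AnsweredBy || "";

  if (UNANSWERED_STATUSES.has(status)) return "unavailable";
  // Machine results look like "machine_start" or "machine_end_beep".
  if (answeredBy.startsWith("machine")) return "voicemail";
  if (status === "in-progress" || status === "completed") return "answered";
  return "pending"; // e.g. "queued", "initiated", "ringing"
}

console.log(classifySpecialistOutcome({ CallStatus: "no-answer" }));
console.log(classifySpecialistOutcome({ CallStatus: "in-progress", AnsweredBy: "machine_start" }));
```

Inside `/participant-status`, both the `"unavailable"` and `"voicemail"` results would lead to the fallback path (announcement or appointment scheduling), while `"answered"` needs no action because the specialist is already in the conference.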
| 194 | + |
| 195 | +6. **`/announce` (Optional) - Ephemeral Announcement** |
| 196 | + |
| 197 | +```js |
| 198 | +app.post("/announce", (req, res) => { |
| 199 | + const VoiceResponse = twilio.twiml.VoiceResponse; |
| 200 | + const twiml = new VoiceResponse(); |
| 201 | + twiml.say("Specialist is not available. Ending call now."); |
| 202 | + |
| 203 | + // Join the conference, then end it. |
| 204 | + twiml.dial().conference( |
| 205 | + { |
| 206 | + startConferenceOnEnter: true, |
| 207 | + endConferenceOnExit: true, |
| 208 | + }, |
| 209 | + "my_conference_room" |
| 210 | + ); |
| 211 | + |
| 212 | + return res.type("text/xml").send(twiml.toString()); |
| 213 | +}); |
| 214 | +``` |
| 215 | + |
| 216 | +7. **Starting the Server** |
| 217 | + |
| 218 | +```js |
| 219 | +app.listen(3000, () => { |
| 220 | + console.log("Server running on port 3000"); |
| 221 | +}); |
| 222 | +``` |
| 223 | + |
| 224 | +## How to Test |
| 225 | + |
| 226 | +1. **Environment Variables** |
| 227 | + Set `TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `FROM_NUMBER`, `TO_NUMBER`, `VAPI_BASE_URL`, `PHONE_NUMBER_ID`, `ASSISTANT_ID`, and `PRIVATE_API_KEY`. |
| 228 | + |
| 229 | +2. **Expose Your Server** |
| 230 | + |
| 231 | + - Use a tool like `ngrok` to create a public URL to port 3000. |
| 232 | + - Configure your Twilio phone number to call `/inbound_call` when a call comes in. |
| 233 | + |
| 234 | +3. **Place a Real Call** |
| 235 | + |
| 236 | + - Dial your Twilio number from a phone. |
| 237 | + - Twilio hits `/inbound_call`, which runs the Vapi logic. |
| 238 | + - Trigger `/connect` to conference the user and dial the specialist. |
| 239 | + - If the specialist answers, they join the same conference. |
| 240 | + - If they never answer, Twilio eventually calls `/participant-status`. |
| 241 | + |
| 242 | +4. **Use cURL for Testing** |
| 243 | + - **Simulate Inbound**: |
| 244 | + ```bash |
| 245 | + curl -X POST https://<public-url>/inbound_call \ |
| 246 | + --data-urlencode "CallSid=CA12345" \ |
| 247 | + --data-urlencode "Caller=+15551112222" |
| 248 | + ``` |
| 249 | + - **Connect**: |
| 250 | + ```bash |
| 251 | + curl -X POST https://<public-url>/connect \ |
| 252 | + -H "Content-Type: application/json" \ |
| 253 | + -d "{}" |
| 254 | + ``` |
| 255 | + |
| 256 | +## Note on Replacing "Connect" with Vapi Tools |
| 257 | + |
| 258 | +Vapi offers built-in functions and custom tool calls for placing a second call or transferring, so you can replace the manual `/connect` call with that Vapi functionality. The flow remains the same: the user is put in a Twilio conference, the specialist is dialed, and any no-answer events are handled. |
| 259 | + |
| 260 | +## Notes & Limitations |
| 261 | + |
| 262 | +1. **Voicemail** |
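| 263 | + If a phone’s voicemail picks up, Twilio sees it as answered. Consider Twilio's answering machine detection (the `machineDetection` call parameter) or a fallback such as asking the specialist to press a key before joining. |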
| 264 | + |
| 265 | +2. **Concurrent Calls** |
| 266 | + Multiple calls at once require storing separate `CallSid`s or similar references. |
| 267 | + |
| 268 | +3. **Conference Behavior** |
| 269 | + `startConferenceOnEnter: true` merges participants immediately; `endConferenceOnExit: true` ends the conference when that participant leaves. |
| 270 | + |
| 271 | +4. **X Seconds** |
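| 272 | + Decide how you detect no-answer. Twilio reports a final `CallStatus` of `no-answer` if the remote side never picks up before the ring timeout expires; the `timeout` parameter on the outbound call sets that limit in seconds. |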
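For the concurrent-calls limitation, the single `globalCallSid` variable and the fixed `my_conference_room` name could be replaced by per-call state. The sketch below is an in-memory illustration only: `activeCalls`, `registerInboundCall`, `findBySpecialistCallSid`, and `releaseCall` are hypothetical names, and a production setup would more likely use a shared store such as a database.

```javascript
// In-memory store keyed by the inbound CallSid (assumption: single process).
const activeCalls = new Map();

// Record an inbound call and give it its own conference room name.
function registerInboundCall(callSid, caller) {
  const entry = {
    callSid,
    caller,
    conferenceName: `conf_${callSid}`, // unique per call, so callers never mix
    specialistCallSid: null, // filled in once the specialist leg is created
  };
  activeCalls.set(callSid, entry);
  return entry;
}

// Look up the inbound call that a specialist leg belongs to,
// e.g. from inside the status callback.
function findBySpecialistCallSid(specialistCallSid) {
  for (const entry of activeCalls.values()) {
    if (entry.specialistCallSid === specialistCallSid) return entry;
  }
  return undefined;
}

// Forget a call once it has ended.
function releaseCall(callSid) {
  return activeCalls.delete(callSid);
}

const first = registerInboundCall("CA111", "+15551112222");
first.specialistCallSid = "CA999";
console.log(findBySpecialistCallSid("CA999").conferenceName);
```

With this shape, `/connect` would need to receive the inbound `CallSid` (for example in the request body) so it can look up the right entry and pass that entry's `conferenceName` to the conference TwiML.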
| 273 | + |
| 274 | +With these steps and code, you can integrate a Vapi assistant with Twilio’s conferencing features to hold the user, dial out to a specialist, and handle an unanswered or unavailable specialist. |