| 1 | +--- |
| 2 | +title: "[Beta] External Actions" |
| 3 | +description: "Have your agent communicate with an External API" |
| 4 | +--- |
| 5 | + |
| 6 | +External Actions allow Vocode agents to take actions outside the realm of a phone call. In particular, Vocode agents can decide to _push_ information to external systems via an API request, and _pull_ information from the API response in order to: |
| 7 | + |
| 8 | +1. change the agent’s behavior based on the pulled information |
| 9 | +2. give the agent context to inform the rest of the phone call |
| 10 | + |
| 11 | +## How it Works |
| 12 | + |
| 13 | +### Configuring the External Action |
| 14 | + |
| 15 | +After each turn of conversation, the Vocode Agent determines whether it's the ideal time to interact with the External API, based primarily on the configured External Action's `description` and `input_schema`. |
| 16 | + |
| 17 | +#### `input_schema` Field |
| 18 | + |
| 19 | +The `input_schema` field is a [JSON Schema](https://json-schema.org/) object that instructs how to properly form a payload to send to the External API. |
| 20 | + |
| 21 | +For example, the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below uses the following JSON Schema: |
| 22 | + |
| 23 | +```json |
| 24 | +{ |
| 25 | + "type": "object", |
| 26 | + "properties": { |
| 27 | + "length": { |
| 28 | + "type": "string", |
| 29 | + "enum": ["30m", "1hr"] |
| 30 | + }, |
| 31 | + "time": { |
| 32 | + "type": "string", |
| 33 | + "pattern": "^\\d{2}:\\d0[ap]m$" |
| 34 | + } |
| 35 | + } |
| 36 | +} |
| 37 | +``` |
| 38 | + |
| 39 | +This states that the External API expects: |
| 40 | + |
| 41 | +- Two fields |
| 42 | + - `length` (string): either "30m" or "1hr" |
| 43 | + - `time` (string): must match a regex pattern for a time whose minutes end in zero, followed by `am` or `pm`, e.g. `10:30am` |
| 44 | + |
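To make the two rules above concrete, here is a minimal sketch that checks a candidate payload against the example schema using only the Python standard library. The helper name `matches_schema` is hypothetical and is not part of the Vocode API; it handles only the `enum` and `pattern` keywords used in this example and is not a general JSON Schema validator.

```python
import re

# Schema copied from the example above; a raw string keeps the backslashes intact.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "length": {"type": "string", "enum": ["30m", "1hr"]},
        "time": {"type": "string", "pattern": r"^\d{2}:\d0[ap]m$"},
    },
}


def matches_schema(payload: dict) -> bool:
    """Hypothetical helper: check a payload against the example schema.

    Supports only the `enum` and `pattern` keywords on string properties.
    """
    if not isinstance(payload, dict):
        return False
    for name, rules in INPUT_SCHEMA["properties"].items():
        if name not in payload:
            continue  # the example schema declares no `required` fields
        value = payload[name]
        if not isinstance(value, str):
            return False
        if "enum" in rules and value not in rules["enum"]:
            return False
        # JSON Schema patterns are unanchored searches, hence re.search.
        if "pattern" in rules and re.search(rules["pattern"], value) is None:
            return False
    return True


print(matches_schema({"length": "30m", "time": "10:30am"}))  # True
print(matches_schema({"length": "45m", "time": "10:30am"}))  # False: not in enum
print(matches_schema({"length": "1hr", "time": "10:35am"}))  # False: minutes end in 5
```

Note that the pattern requires a two-digit hour, so `9:30am` would be rejected while `09:30am` would be accepted.
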
| 45 | +<Card title="💡 Note" color="#ca8b04"> |
| 46 | + If you’re noticing that this looks very similar to OpenAI function calling, it is! The Vocode API treats OpenAI LLMs as first-class and uses the function calling API when the agent uses an OpenAI LLM. |
| 47 | + |
| 48 | + The lone difference is that the top-level `input_schema` JSON Schema must be an `object`, so that parameters can be sent to the user’s API as JSON. |
| 49 | + |
| 50 | +</Card> |
| 51 | + |
| 52 | +#### `description` Field |
| 53 | + |
| 54 | +The `description` is best used to describe your External Action's purpose. Because it's passed directly to the LLM, it's the best way to convey instructions to the underlying Vocode Agent. |
| 55 | + |
| 56 | +For example, in the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below we want to book a meeting for 30 minutes or an hour, so we set the description to `Book a meeting for a 30 minute or 1 hour call.` |
| 57 | + |
| 58 | +<Card title="💡 Note" color="#ca8b04"> |
| 59 | + The `description` field is passed through and heavily affects how we do our |
| 60 | + function decisioning so we recommend treating it in the same way you would a |
| 61 | + prompt to an LLM! |
| 62 | +</Card> |
| 63 | + |
| 64 | +#### Other Fields to Determine Agent Behavior |
| 65 | + |
| 66 | +- `speak_on_send`: if `True`, then the underlying LLM will generate a message to be spoken into the phone call as the |
| 67 | + API request is being sent. |
| 68 | +- `url`: The API request is sent to this URL in the format defined below in [Responding to External Action API Requests](/external-actions#responding-to-external-action-api-requests). |
| 69 | + |
| 70 | +- `speak_on_receive`: if `True`, then the Vocode Agent will invoke the underlying |
| 71 | + LLM to respond based on the result from the API Response or the Error encountered. |
| 72 | + |
| 73 | +### Responding to External Action API Requests |
| 74 | + |
| 75 | +Once an External Action has been created, the Vocode Agent will issue API requests to the defined `url` during the course of a phone call based on the [configuration noted above](/external-actions#configuring-the-external-action). |
| 76 | +The Vocode API will wait a maximum of _10 seconds_ before timing out the request. |
| 77 | + |
| 78 | +In particular, Vocode will issue a POST request to `url` with a JSON body whose `payload` field matches `input_schema`. Using the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below, the request looks like: |
| 79 | + |
| 80 | +```bash |
| 81 | +POST url HTTP/1.1 |
| 82 | +Accept: application/json |
| 83 | +Content-Type: application/json |
| 84 | +x-vocode-signature: <encoded_signature> |
| 85 | + |
| 86 | +{ |
| 87 | + "call_id": <UUID>, |
| 88 | + "payload": { |
| 89 | + "length": "30m", |
| 90 | + "time": "10:30am" |
| 91 | + } |
| 92 | +} |
| 93 | +``` |
| 94 | + |
| 95 | +#### Signature Validation |
| 96 | + |
| 97 | +A cryptographic signature of the request body and a randomly generated byte hash is included as a header (under `x-vocode-signature`) in the outbound request so that the user’s API can verify the origin of the incoming request. |
| 98 | + |
| 99 | +The signature secret is contained in the External Action's API object and can be found when creating an object (as noted below in the [Meeting Assistant Example](/external-actions#meeting-assistant-example)), or by getting the API object via the `/v1/actions?id=ACTION_ID` endpoint: |
| 100 | + |
| 101 | +<CodeGroup> |
| 102 | +```bash Example cURL Request |
| 103 | +curl --request GET \ |
| 104 | + --url 'https://api.vocode.dev/v1/actions?id=<EXAMPLE_ACTION_ID>' \ |
| 105 | + --header 'Content-Type: application/json' \ |
| 106 | + --header 'Authorization: Bearer <API_KEY>' |
| 107 | +``` |
| 108 | + |
| 109 | +```json Response |
| 110 | +{ |
| 111 | + "id": "<EXAMPLE_ACTION_ID>", |
| 112 | + "user_id": "ecd792cf-18a2-420b-91f5-cdaf22f5f562", |
| 113 | + "type": "action_external", |
| 114 | + "config": { |
| 115 | + "processing_mode": "muted", |
| 116 | + "name": "Meeting_Booking_Assistant", |
| 117 | + "description": "Book a meeting for a 30 minute or 1 hour call.", |
| 118 | + "url": "http://example.com/booking", |
| 119 | + "input_schema": { |
| 120 | + "type": "object", |
| 121 | + "properties": { |
| 122 | + "length": { |
| 123 | + "type": "string", |
| 124 | + "enum": ["30m", "1hr"] |
| 125 | + }, |
| 126 | + "time": { |
| 127 | + "type": "string", |
| 128 | + "pattern": "^\\d{2}:\\d0[ap]m$" |
| 129 | + } |
| 130 | + } |
| 131 | + }, |
| 132 | + "speak_on_send": true, |
| 133 | + "speak_on_receive": true, |
| 134 | + "signature_secret": "MX/9/+iblnUoAAM2Jft8sgeY1bevJvuih2nr7XKPHIY=" |
| 135 | + }, |
| 136 | + "action_trigger": { |
| 137 | + "type": "action_trigger_function_call", |
| 138 | + "config": {} |
| 139 | + } |
| 140 | +} |
| 141 | +``` |
| 142 | + |
| 143 | +</CodeGroup> |
| 144 | +Use the following code snippet to check the signature in an inbound request: |
| 145 | + |
| 146 | +<CodeGroup> |
| 147 | +```python Python |
| 148 | +import base64 |
| 149 | +import hashlib |
| 150 | +import hmac |
| 151 | + |
| 152 | +async def test_requester_encodes_signature( |
| 153 | +    request_signature_value: str, signature_secret: str, payload: bytes |
| 154 | +): |
| 155 | +    """ |
| 156 | +    Asynchronous function to check if the request signature is encoded correctly. |
| 157 | + |
| 158 | +    Args: |
| 159 | +        request_signature_value (str): The base64-encoded `x-vocode-signature` header value. |
| 160 | +        signature_secret (str): The base64-encoded signature secret of the External Action. |
| 161 | +        payload (bytes): The raw request body; `hmac.new` requires bytes, not a dict. |
| 162 | + |
| 163 | +    Returns: |
| 164 | +        None |
| 165 | +    """ |
| 166 | +    signature_secret_as_bytes = base64.b64decode(signature_secret) |
| 167 | +    decoded_digest = base64.b64decode(request_signature_value) |
| 168 | +    calculated_digest = hmac.new(signature_secret_as_bytes, payload, hashlib.sha256).digest() |
| 169 | +    assert hmac.compare_digest(decoded_digest, calculated_digest) is True |
| 170 | + |
| 171 | +``` |
| 172 | + |
| 173 | +```typescript TypeScript |
| 174 | +import * as crypto from 'crypto'; |
| 175 | + |
| 176 | +async function testRequesterEncodesSignature( |
| 177 | + requestSignatureValue: string, |
| 178 | + signatureSecret: string, |
| 179 | + payload: Record<string, unknown> |
| 180 | +): Promise<void> { |
| 181 | + /** |
| 182 | + * Asynchronous function to check if the request signature is encoded correctly. |
| 183 | + * |
| 184 | + * @param requestSignatureValue - The request signature to be decoded. |
| 185 | + * @param signatureSecret - The signature to be decoded and used for comparison. |
| 186 | + * @param payload - The payload to be used for digest calculation. |
| 187 | + */ |
| 188 | + const signatureAsBytes = Buffer.from(signatureSecret, 'base64'); |
| 189 | + const decodedDigest = Buffer.from(requestSignatureValue, 'base64'); |
| 190 | + const payloadString = JSON.stringify(payload); |
| 191 | + const calculatedDigest = crypto |
| 192 | + .createHmac('sha256', signatureAsBytes) |
| 193 | + .update(payloadString) |
| 194 | + .digest(); |
| 195 | + |
| 196 | + if (!crypto.timingSafeEqual(decodedDigest, calculatedDigest)) { |
| 197 | + throw new Error('Signature mismatch'); |
| 198 | + } |
| 199 | +} |
| 200 | +``` |
| 201 | + |
| 202 | +</CodeGroup> |
| 203 | + |
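The check can also be exercised locally without a live call. The sketch below assumes, as the verification snippets above imply, that the header carries a base64-encoded HMAC-SHA256 digest of the raw request body, keyed with the base64-decoded `signature_secret`. The helpers `sign_body` and `is_valid_signature` are hypothetical names, and the secret and body are made-up test values, not real Vocode data.

```python
import base64
import hashlib
import hmac


def sign_body(signature_secret: str, body: bytes) -> str:
    """Hypothetical test helper: produce a header value for a given body.

    Assumes base64(HMAC-SHA256(secret, body)), mirroring the check above.
    """
    secret = base64.b64decode(signature_secret)
    digest = hmac.new(secret, body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_signature(header_value: str, signature_secret: str, body: bytes) -> bool:
    """Return True if the header matches the digest computed over the raw body."""
    try:
        received = base64.b64decode(header_value, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    expected = base64.b64decode(sign_body(signature_secret, body))
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(received, expected)


# Made-up values for a local round trip.
secret = base64.b64encode(b"not-a-real-secret").decode("ascii")
body = b'{"call_id": "00000000-0000-0000-0000-000000000000", "payload": {"length": "30m", "time": "10:30am"}}'
header = sign_body(secret, body)

print(is_valid_signature(header, secret, body))         # True
print(is_valid_signature(header, secret, body + b" "))  # False: body was altered
print(is_valid_signature("not base64!!", secret, body))  # False: malformed header
```

Returning a boolean instead of asserting lets a request handler reject a bad signature with an error response rather than raising.
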
| 204 | +#### Response Formatting |
| 205 | + |
| 206 | +Vocode expects responses from the user’s API in JSON in the following format: |
| 207 | + |
| 208 | +```python |
| 209 | +Response { |
| 210 | + result: Any |
| 211 | + agent_message: Optional[str] = None |
| 212 | +} |
| 213 | +``` |
| 214 | + |
| 215 | +- `result` is a payload containing the result of the action on the user’s side, and can be in any format |
| 216 | +- `agent_message` optionally contains a message that will be synthesized into audio and sent back to the phone call (see [Configuring the External Action](/external-actions#configuring-the-external-action) above for more info) |
| 217 | + |
| 218 | +In the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below, the user’s API could return a JSON response that looks like: |
| 219 | + |
| 220 | +```json |
| 221 | +{ |
| 222 | + "result": { |
| 223 | + "success": true |
| 224 | + }, |
| 225 | + "agent_message": "I've set up a calendar appointment at 10:30am tomorrow for 30 minutes" |
| 226 | +} |
| 227 | +``` |
| 228 | + |
| 229 | +## Meeting Assistant Example: |
| 230 | + |
| 231 | +This is an example of a Meeting Assistant that will attempt to book a meeting for 30 minutes or an hour at any time ending in a zero (e.g. 10:30am is okay but 10:35am is not). |
| 232 | + |
| 233 | +<CodeGroup> |
| 234 | + |
| 235 | +```python Python |
| 236 | +vocode_client.actions.create_action( |
| 237 | + request={ |
| 238 | + "type": "action_external", |
| 239 | + "config": { |
| 240 | + "name": "Meeting_Booking_Assistant", |
| 241 | + "description": ("Book a meeting for a 30 minute or 1 hour call."), |
| 242 | + "url": "http://example.com/booking", |
| 243 | + "speak_on_send": True, |
| 244 | + "speak_on_receive": True, |
| 245 | + "input_schema": { |
| 246 | + "type": "object", |
| 247 | + "properties": { |
| 248 | + "length": { |
| 249 | + "type": "string", |
| 250 | + "enum": ["30m", "1hr"], |
| 251 | + }, |
| 252 | + "time": { |
| 253 | + "type": "string", |
| 254 | + "pattern": r"^\d{2}:\d0[ap]m$", |
| 255 | + }, |
| 256 | + }, |
| 257 | + }, |
| 258 | + }, |
| 259 | + }, |
| 260 | +) |
| 261 | +``` |
| 262 | + |
| 263 | +```typescript TypeScript |
| 264 | +const action = await vocode.actions.createAction({ |
| 265 | + type: "action_external", |
| 266 | + config: { |
| 267 | + name: "Meeting_Booking_Assistant", |
| 268 | + description: "Book a meeting for a 30 minute or 1 hour call.", |
| 269 | + url: "http://example.com/booking", |
| 270 | + speak_on_send: true, |
| 271 | + speak_on_receive: true, |
| 272 | + input_schema: { |
| 273 | + type: "object", |
| 274 | + properties: { |
| 275 | + length: { |
| 276 | + type: "string", |
| 277 | + enum: ["30m", "1hr"], |
| 278 | + }, |
| 279 | + time: { |
| 280 | + type: "string", |
| 281 | + pattern: "^\\d{2}:\\d0[ap]m$", |
| 282 | + }, |
| 283 | + }, |
| 284 | + }, |
| 285 | + }, |
| 286 | +}); |
| 287 | +``` |
| 288 | + |
| 289 | +```bash cURL |
| 290 | +curl --request POST \ |
| 291 | + --url https://api.vocode.dev/v1/actions/create \ |
| 292 | + --header 'Content-Type: application/json' \ |
| 293 | + --header 'Authorization: Bearer <API_KEY>' \ |
| 294 | + --data '{ |
| 295 | + "type": "action_external", |
| 296 | + "config": { |
| 297 | + "name": "Meeting_Booking_Assistant", |
| 298 | + "description": "Book a meeting for a 30 minute or 1 hour call.", |
| 299 | + "url": "http://example.com/booking", |
| 300 | + "speak_on_send": true, |
| 301 | + "speak_on_receive": true, |
| 302 | + "input_schema": { |
| 303 | + "type": "object", |
| 304 | + "properties": { |
| 305 | + "length": { |
| 306 | + "type": "string", |
| 307 | + "enum": ["30m", "1hr"] |
| 308 | + }, |
| 309 | + "time": { |
| 310 | + "type": "string", |
| 311 | + "pattern": "^\\d{2}:\\d0[ap]m$" |
| 312 | + } |
| 313 | + } |
| 314 | + } |
| 315 | + } |
| 316 | +}' |
| 317 | +``` |
| 318 | + |
| 319 | +</CodeGroup> |
| 320 | + |
| 321 | +<Card title="💡 Note" color="#ca8b04"> |
| 322 | + If you're looking to attach the created External Action to your agent after |
| 323 | + creating it in the example above, there's an example of how to do this on the |
| 324 | + [Using Actions](/using-actions#transfercall) page under the TransferCall |
| 325 | + section! |
| 326 | +</Card> |