Commit abecce0

Mac Wilkinson, srhinos, ajar98
authored
Add External Actions Front Facing Docs (vocodedev#531)
* Add External Actions Front Facing Docs
* Fix Python Formatting
* Change Some Wording + Elaborate Further
* Remove Duplicated Code Example and Just Redirect Instead
* Update docs/external-actions.mdx

---------

Co-authored-by: srhinos <[email protected]>
Co-authored-by: Ajay Raj <[email protected]>
1 parent 5eb89f1 commit abecce0

File tree

3 files changed

+343
-4
lines changed

docs/external-actions.mdx

Lines changed: 326 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,326 @@
1+
---
2+
title: "[Beta] External Actions"
3+
description: "Have your agent communicate with an External API"
4+
---
5+
6+
External Actions allow Vocode agents to take actions outside the realm of a phone call. In particular, Vocode agents can decide to _push_ information to external systems via an API request, and _pull_ information from the API response in order to:
7+
8+
1. change the agent’s behavior based on the pulled information
9+
2. give the agent context to inform the rest of the phone call
10+
11+
## How it Works
12+
13+
### Configuring the External Action
14+
15+
After each turn of conversation, the Vocode Agent determines whether it's the right time to interact with the External API, based primarily on the configured External Action's `description` and `input_schema`.
16+
17+
#### `input_schema` Field
18+
19+
The `input_schema` field is a [JSON Schema](https://json-schema.org/) object that instructs how to properly form a payload to send to the External API.
20+
21+
For example, in the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below we use the following JSON Schema:
22+
23+
```json
{
  "type": "object",
  "properties": {
    "length": {
      "type": "string",
      "enum": ["30m", "1hr"]
    },
    "time": {
      "type": "string",
      "pattern": "^\\d{2}:\\d0[ap]m$"
    }
  }
}
```
38+
39+
This states that the External API expects two fields:

- `length` (string): either "30m" or "1hr"
- `time` (string): a time matching the regex pattern, with minutes ending in a zero and `am`/`pm` on the end, e.g. `10:30am`
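
As a rough illustration of what this schema accepts, the sketch below checks a candidate payload by hand using only the Python standard library. The function name `matches_meeting_schema` is hypothetical and not part of Vocode; it only mirrors the two constraints in the example schema, and a real integration would more likely rely on a full JSON Schema validator.

```python
import re

# Constraints copied from the example schema above.
ALLOWED_LENGTHS = ("30m", "1hr")
TIME_PATTERN = re.compile(r"^\d{2}:\d0[ap]m$")


def matches_meeting_schema(payload: dict) -> bool:
    """Hypothetical helper: hand-rolled check mirroring the example schema."""
    # The example schema declares no "required" fields, so absent keys pass.
    if "length" in payload and payload["length"] not in ALLOWED_LENGTHS:
        return False
    if "time" in payload:
        time_value = payload["time"]
        # Must be a string whose minutes end in zero, followed by am/pm.
        if not isinstance(time_value, str) or TIME_PATTERN.search(time_value) is None:
            return False
    return True
```

With this sketch, `{"length": "30m", "time": "10:30am"}` passes, while a payload with `"time": "10:35am"` or `"length": "45m"` is rejected.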
44+
45+
<Card title="💡 Note" color="#ca8b04">
46+
If you’re noticing that this looks very similar to OpenAI function calling, it is! The Vocode API treats OpenAI LLMs as first-class and uses the function calling API when the agent uses an OpenAI LLM.
47+
48+
The lone difference is that the top level `input_schema` JSON schema must be an `object` - this is so we can use JSON to send over parameters to the user’s API.
49+
50+
</Card>
51+
52+
#### `description` Field
53+
54+
The `description` field is best used to describe your External Action's purpose. As it's passed through directly to the LLM, it's the best way to convey instructions to the underlying Vocode Agent.
55+
56+
For example, in the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below we want to book a meeting for 30 minutes to an hour so we set the description as `Book a meeting for a 30 minute or 1 hour call.`
57+
58+
<Card title="💡 Note" color="#ca8b04">
59+
The `description` field is passed through to the LLM and heavily affects how
the agent decides when to call the function, so we recommend treating it in
the same way you would a prompt to an LLM!
62+
</Card>
63+
64+
#### Other Fields to Determine Agent Behavior
65+
66+
- `speak_on_send`: if `True`, then the underlying LLM will generate a message to be spoken into the phone call as the API request is being sent.
- `url`: The API request is sent to this URL in the format defined below in [Responding to External Action API Requests](/external-actions#responding-to-external-action-api-requests).
- `speak_on_receive`: if `True`, then the Vocode Agent will invoke the underlying LLM to respond based on the result from the API response or the error encountered.
72+
73+
### Responding to External Action API Requests
74+
75+
Once an External Action has been created, the Vocode Agent will issue API requests to the defined `url` during the course of a phone call based on the [configuration noted above](/external-actions#configuring-the-external-action).
76+
The Vocode API will wait a maximum of _10 seconds_ before timing out the request.
77+
78+
In particular, Vocode will issue a POST request to `url` with a JSON payload that matches `input_schema`, specifically (using the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below):
79+
80+
```bash
81+
POST url HTTP/1.1
82+
Accept: application/json
83+
Content-Type: application/json
84+
x-vocode-signature: <encoded_signature>
85+
86+
{
87+
"call_id": <UUID>,
88+
"payload": {
89+
"length": "30m",
90+
"time": "10:30am"
91+
}
92+
}
93+
```
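
On the receiving side, the body shown above can be unpacked with the standard library alone. The following is a minimal sketch, not Vocode's own code: the helper name `parse_external_action_request` is hypothetical, and it assumes `call_id` arrives as a JSON string and that both top-level keys are present.

```python
import json
from typing import Any, Dict, Tuple


def parse_external_action_request(raw_body: bytes) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: split the request body into call_id and payload."""
    body = json.loads(raw_body)
    # Assumption: both keys are always present, as in the example request above.
    return body["call_id"], body["payload"]


# Example body modelled on the request above; the UUID is a placeholder.
example_body = (
    b'{"call_id": "00000000-0000-0000-0000-000000000000", '
    b'"payload": {"length": "30m", "time": "10:30am"}}'
)
call_id, payload = parse_external_action_request(example_body)
```

Keeping the raw bytes around (rather than only the parsed dictionary) is useful, since the signature check described in the next section operates on the request body.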
94+
95+
#### Signature Validation
96+
97+
A cryptographic signature of the request body, computed with a randomly generated secret, is included as a header (under `x-vocode-signature`) in the outbound request so that the user’s API can validate the identity of the incoming request.
98+
99+
The signature secret is contained in the External Action's API object and can be found when creating an object (as noted below in the [Meeting Assistant Example](/external-actions#meeting-assistant-example)), or by getting the API object via the `/v1/actions?id=ACTION_ID` endpoint:
100+
101+
<CodeGroup>
102+
```bash Example cURL Request
103+
curl --request GET \
104+
--url 'https://api.vocode.dev/v1/actions?id=<EXAMPLE_ACTION_ID>' \
105+
--header 'Content-Type: application/json' \
106+
--header 'Authorization: Bearer <API_KEY>'
107+
```
108+
109+
```json Response
110+
{
111+
"id": "<EXAMPLE_ACTION_ID>",
112+
"user_id": "ecd792cf-18a2-420b-91f5-cdaf22f5f562",
113+
"type": "action_external",
114+
"config": {
115+
"processing_mode": "muted",
116+
"name": "Meeting_Booking_Assistant",
117+
"description": "Book a meeting for a 30 minute or 1 hour call.",
118+
"url": "http://example.com/booking",
119+
"input_schema": {
120+
"type": "object",
121+
"properties": {
122+
"length": {
123+
"type": "string",
124+
"enum": ["30m", "1hr"]
125+
},
126+
"time": {
127+
"type": "string",
128+
"pattern": "^\\d{2}:\\d0[ap]m$"
129+
}
130+
}
131+
},
132+
"speak_on_send": true,
133+
"speak_on_receive": true,
134+
"signature_secret": "MX/9/+iblnUoAAM2Jft8sgeY1bevJvuih2nr7XKPHIY="
135+
},
136+
"action_trigger": {
137+
"type": "action_trigger_function_call",
138+
"config": {}
139+
}
140+
}
141+
```
142+
143+
</CodeGroup>
144+
Use the following code snippet to check the signature in an inbound request:
145+
146+
<CodeGroup>
147+
```python Python
import base64
import hashlib
import hmac


async def test_requester_encodes_signature(
    request_signature_value: str, signature_secret: str, payload: bytes
):
    """
    Asynchronous function to check if the request signature is encoded correctly.

    Args:
        request_signature_value (str): The base64-encoded request signature to be decoded.
        signature_secret (str): The base64-encoded signature secret used for comparison.
        payload (bytes): The raw request body to be used for digest calculation.

    Returns:
        None
    """
    signature_secret_as_bytes = base64.b64decode(signature_secret)
    decoded_digest = base64.b64decode(request_signature_value)
    # hmac.new requires the message as bytes, so pass the request body bytes rather than a dict
    calculated_digest = hmac.new(signature_secret_as_bytes, payload, hashlib.sha256).digest()
    # Constant-time comparison of the received and calculated digests
    assert hmac.compare_digest(decoded_digest, calculated_digest) is True
```
172+
173+
```typescript TypeScript
174+
import * as crypto from 'crypto';
175+
176+
async function testRequesterEncodesSignature(
177+
requestSignatureValue: string,
178+
signatureSecret: string,
179+
payload: Record<string, unknown>
180+
): Promise<void> {
181+
/**
182+
* Asynchronous function to check if the request signature is encoded correctly.
183+
*
184+
* @param requestSignatureValue - The request signature to be decoded.
185+
* @param signatureSecret - The signature to be decoded and used for comparison.
186+
* @param payload - The payload to be used for digest calculation.
187+
*/
188+
const signatureAsBytes = Buffer.from(signatureSecret, 'base64');
189+
const decodedDigest = Buffer.from(requestSignatureValue, 'base64');
190+
const payloadString = JSON.stringify(payload);
191+
const calculatedDigest = crypto
192+
.createHmac('sha256', signatureAsBytes)
193+
.update(payloadString)
194+
.digest();
195+
196+
if (!crypto.timingSafeEqual(decodedDigest, calculatedDigest)) {
197+
throw new Error('Signature mismatch');
198+
}
199+
}
200+
```
201+
202+
</CodeGroup>
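
To exercise a signature check locally without placing a call, it can help to produce a signature the same way the snippets above verify one. The sketch below assumes, as those snippets do, that the header carries a base64-encoded HMAC-SHA256 digest of the request body, keyed with the base64-decoded `signature_secret`. The function names `sign_body` and `is_valid_signature` and the sample secret are hypothetical.

```python
import base64
import binascii
import hashlib
import hmac


def sign_body(signature_secret: str, raw_body: bytes) -> str:
    """Hypothetical helper: base64-encoded HMAC-SHA256 of the raw body."""
    secret = base64.b64decode(signature_secret)
    digest = hmac.new(secret, raw_body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_signature(header_value: str, signature_secret: str, raw_body: bytes) -> bool:
    """Hypothetical helper: True if header_value matches the expected digest."""
    expected = base64.b64decode(sign_body(signature_secret, raw_body))
    try:
        received = base64.b64decode(header_value)
    except binascii.Error:
        # A malformed header cannot be a valid signature.
        return False
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(received, expected)


# Made-up secret and body, for local testing only.
sample_secret = base64.b64encode(b"not-a-real-secret").decode("ascii")
sample_body = b'{"call_id": "test", "payload": {"length": "30m", "time": "10:30am"}}'
sample_signature = sign_body(sample_secret, sample_body)
```

Returning a boolean rather than asserting lets a request handler respond with an error status when the check fails, instead of raising.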
203+
204+
#### Response Formatting
205+
206+
Vocode expects responses from the user’s API in JSON in the following format:
207+
208+
```python
209+
Response {
210+
result: Any
211+
agent_message: Optional[str] = None
212+
}
213+
```
214+
215+
- `result` is a payload containing the result of the action on the user’s side, and can be in any format
216+
- `agent_message` optionally contains a message that will be synthesized into audio and sent back to the phone call (see [Configuring the External Action](/external-actions#configuring-the-external-action) above for more info)
217+
218+
In the [Meeting Assistant Example](/external-actions#meeting-assistant-example) below, the user’s API could return back a JSON response that looks like:
219+
220+
```json
221+
{
222+
"result": {
223+
"success": true
224+
},
225+
"agent_message": "I've set up a calendar appointment at 10:30am tomorrow for 30 minutes"
226+
}
227+
```
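
A response in this shape can be assembled with a small helper. This is a minimal sketch under the assumption that leaving `agent_message` out of the JSON is equivalent to sending no message, which the `Optional[str] = None` default above suggests; the name `build_action_response` is hypothetical.

```python
import json
from typing import Any, Optional


def build_action_response(result: Any, agent_message: Optional[str] = None) -> str:
    """Hypothetical helper: serialize a response in the format described above."""
    response = {"result": result}
    # Only include agent_message when there is something for the agent to say.
    if agent_message is not None:
        response["agent_message"] = agent_message
    return json.dumps(response)


body = build_action_response(
    {"success": True},
    "I've set up a calendar appointment at 10:30am tomorrow for 30 minutes",
)
```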
228+
229+
## Meeting Assistant Example
230+
231+
This is an example of a Meeting Assistant which will attempt to book a meeting for 30 minutes or an hour at any time ending in a zero (i.e. 10:30am is okay but 10:35am is not).
232+
233+
<CodeGroup>
234+
235+
```python Python
236+
vocode_client.actions.create_action(
237+
request={
238+
"type": "action_external",
239+
"config": {
240+
"name": "Meeting_Booking_Assistant",
241+
"description": ("Book a meeting for a 30 minute or 1 hour call."),
242+
"url": "http://example.com/booking",
243+
"speak_on_send": True,
244+
"speak_on_receive": True,
245+
"input_schema": {
246+
"type": "object",
247+
"properties": {
248+
"length": {
249+
"type": "string",
250+
"enum": ["30m", "1hr"],
251+
},
252+
"time": {
253+
"type": "string",
254+
"pattern": r"^\d{2}:\d0[ap]m$",
255+
},
256+
},
257+
},
258+
},
259+
},
260+
)
261+
```
262+
263+
```typescript TypeScript
264+
const action = await vocode.actions.createAction({
265+
type: "action_external",
266+
config: {
267+
name: "Meeting_Booking_Assistant",
268+
description: "Book a meeting for a 30 minute or 1 hour call.",
269+
url: "http://example.com/booking",
270+
speak_on_send: true,
271+
speak_on_receive: true,
272+
input_schema: {
273+
type: "object",
274+
properties: {
275+
length: {
276+
type: "string",
277+
enum: ["30m", "1hr"],
278+
},
279+
time: {
280+
type: "string",
281+
pattern: "^\\d{2}:\\d0[ap]m$",
282+
},
283+
},
284+
},
285+
},
286+
});
287+
```
288+
289+
```bash cURL
290+
curl --request POST \
291+
--url https://api.vocode.dev/v1/actions/create \
292+
--header 'Content-Type: application/json' \
293+
--header 'Authorization: Bearer <API_KEY>' \
294+
--data '{
295+
"type": "action_external",
296+
"config": {
297+
"name": "Meeting_Booking_Assistant",
298+
"description": "Book a meeting for a 30 minute or 1 hour call.",
299+
"url": "http://example.com/booking",
300+
"speak_on_send": true,
301+
"speak_on_receive": true,
302+
"input_schema": {
303+
"type": "object",
304+
"properties": {
305+
"length": {
306+
"type": "string",
307+
"enum": ["30m", "1hr"]
308+
},
309+
"time": {
310+
"type": "string",
311+
"pattern": "^\\d{2}:\\d0[ap]m$"
312+
}
313+
}
314+
}
315+
}
316+
}'
317+
```
318+
319+
</CodeGroup>
320+
321+
<Card title="💡 Note" color="#ca8b04">
322+
If you're looking to attach the created External Action to your agent after
creating it in the example above, there's an example of how to do this on the
[Using Actions](/using-actions#transfercall) page under the TransferCall
section!
326+
</Card>

docs/mint.json

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -179,6 +179,7 @@
179179
"conversational-dials",
180180
"setting-up-webhook",
181181
"using-actions",
182+
"external-actions",
182183
"vectordb",
183184
"multilingual",
184185
"injecting-context",

docs/using-actions.mdx

Lines changed: 16 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3,9 +3,11 @@ title: "Using actions"
33
description: "Make agents on your phone calls take actions"
44
---
55

6-
Agents on phone calls can take three actions:
6+
Agents on phone calls can take four actions:
77

8-
`EndConversation`: allows the agent to end the call, e.g. if the user says "Goodbye!"
8+
#### EndConversation
9+
10+
`EndConversation` allows the agent to end the call, e.g. if the user says "Goodbye!"
911

1012
<CodeGroup>
1113

@@ -35,7 +37,9 @@ curl --request POST \
3537

3638
</CodeGroup>
3739

38-
`DTMF`: allows the agent to hit dial tones during a call, e.g. navigating a phone tree
40+
#### DTMF
41+
42+
`DTMF` allows the agent to hit dial tones during a call, e.g. navigating a phone tree
3943

4044
<CodeGroup>
4145

@@ -65,7 +69,9 @@ curl --request POST \
6569

6670
</CodeGroup>
6771

68-
`TransferCall`: allows the agent to transfer the call to another phone number
72+
#### TransferCall
73+
74+
`TransferCall` allows the agent to transfer the call to another phone number
6975

7076
<CodeGroup>
7177

@@ -196,3 +202,9 @@ curl --request POST \
196202
```
197203

198204
</CodeGroup>
205+
206+
#### [Beta] ExternalAction
207+
208+
`ExternalAction` allows your agent to communicate with an External API and include the response as context in the conversation.
209+
210+
See the dedicated [External Actions](/external-actions) page for more info!
