Commit 52d9133

Xiting Zhang (xitzhang) authored
[VoiceLive] Add sample of function calling (#47516)
* [VoiceLive] Add sample of function calling
* update
---------
Co-authored-by: Xiting Zhang <[email protected]>
1 parent 5cd16e7 commit 52d9133

File tree: 3 files changed, +779 -2 lines changed


sdk/ai/azure-ai-voicelive/README.md

Lines changed: 71 additions & 2 deletions
@@ -125,6 +125,7 @@ The following sections provide code snippets for common scenarios:
* [Send audio input](#send-audio-input)
* [Handle event types](#handle-event-types)
* [Voice configuration](#voice-configuration)
+* [Function calling](#function-calling)
* [Complete voice assistant with microphone](#complete-voice-assistant-with-microphone)

### Focused Sample Files
@@ -158,9 +159,16 @@ For easier learning, explore these focused samples in order:
   - Noise reduction and echo cancellation
   - Multi-threaded audio processing

-> **Note:** To run audio samples (AudioPlaybackSample, MicrophoneInputSample, VoiceAssistantSample):
+6. **FunctionCallingSample.java** - Voice assistant with custom function tools
+   - Define function tools with parameters
+   - Register functions with the VoiceLive session
+   - Handle function call requests from the AI model
+   - Execute functions locally and return results
+   - Continue conversation with function results
+
+> **Note:** To run audio samples (AudioPlaybackSample, MicrophoneInputSample, VoiceAssistantSample, FunctionCallingSample):
> ```bash
-> mvn exec:java -Dexec.mainClass=com.azure.ai.voicelive.AudioPlaybackSample -Dexec.classpathScope=test
+> mvn exec:java -Dexec.mainClass=com.azure.ai.voicelive.FunctionCallingSample -Dexec.classpathScope=test
> ```
> These samples use `javax.sound.sampled` for audio I/O.
@@ -328,6 +336,67 @@ VoiceLiveSessionOptions options3 = new VoiceLiveSessionOptions()
        new AzurePersonalVoice("speakerProfileId", PersonalVoiceModels.PHOENIX_LATEST_NEURAL)));
```

+### Function calling
+
+Enable your voice assistant to call custom functions during conversations. This allows the AI to request information or perform actions by executing your code:
+
+```java com.azure.ai.voicelive.functioncalling
+// 1. Define function tool with parameters
+VoiceLiveFunctionDefinition getWeatherFunction = new VoiceLiveFunctionDefinition("get_current_weather")
+    .setDescription("Get the current weather in a given location")
+    .setParameters(BinaryData.fromObject(parametersSchema)); // JSON schema
+
+// 2. Configure session with tools
+VoiceLiveSessionOptions options = new VoiceLiveSessionOptions()
+    .setTools(Arrays.asList(getWeatherFunction))
+    .setInstructions("You have access to weather information. Use get_current_weather when asked about weather.");
+
+// 3. Handle function call events
+client.startSession("gpt-4o-realtime-preview")
+    .flatMap(session -> {
+        session.receiveEvents()
+            .subscribe(event -> {
+                if (event instanceof SessionUpdateConversationItemCreated) {
+                    SessionUpdateConversationItemCreated itemCreated = (SessionUpdateConversationItemCreated) event;
+                    if (itemCreated.getItem().getType() == ItemType.FUNCTION_CALL) {
+                        ResponseFunctionCallItem functionCall = (ResponseFunctionCallItem) itemCreated.getItem();
+
+                        // Wait for arguments
+                        String callId = functionCall.getCallId();
+                        String arguments = waitForArguments(session, callId); // Helper method
+
+                        // Execute function
+                        try {
+                            Map<String, Object> result = getCurrentWeather(arguments);
+                            String resultJson = new ObjectMapper().writeValueAsString(result);
+
+                            // Return result
+                            FunctionCallOutputItem output = new FunctionCallOutputItem(callId, resultJson);
+                            ClientEventConversationItemCreate createItem = new ClientEventConversationItemCreate()
+                                .setItem(output)
+                                .setPreviousItemId(functionCall.getId());
+
+                            session.sendEvent(createItem).subscribe();
+                            session.sendEvent(new ClientEventResponseCreate()).subscribe();
+                        } catch (Exception e) {
+                            System.err.println("Error executing function: " + e.getMessage());
+                        }
+                    }
+                }
+            });
+
+        return Mono.just(session);
+    })
+    .block();
+```
+
+**Key points:**
+* Define function tools with JSON schemas describing parameters
+* The AI decides when to call functions based on conversation context
+* Your code executes the function and returns results
+* Results are sent back to continue the conversation
+* See `FunctionCallingSample.java` for a complete working example
+
### Complete voice assistant with microphone

A full example demonstrating real-time microphone input and audio playback:
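
The README snippet above passes a `parametersSchema` object to `setParameters(BinaryData.fromObject(parametersSchema))` without showing how that object is built. Below is a minimal sketch of one way to construct it as a plain `java.util.Map` following the usual JSON Schema shape for function parameters; the specific properties (`location`, `unit`) are illustrative assumptions, not part of the committed sample.

```java
import com.azure.core.util.BinaryData;

import java.util.List;
import java.util.Map;

public class WeatherToolSchema {
    public static void main(String[] args) {
        // Hypothetical JSON-Schema-style description of the function's parameters.
        // Property names ("location", "unit") are illustrative assumptions.
        Map<String, Object> parametersSchema = Map.of(
            "type", "object",
            "properties", Map.of(
                "location", Map.of(
                    "type", "string",
                    "description", "City and state, e.g. Seattle, WA"),
                "unit", Map.of(
                    "type", "string",
                    "enum", List.of("celsius", "fahrenheit"))),
            "required", List.of("location"));

        // BinaryData.fromObject(...) serializes the map to JSON, matching the
        // setParameters(BinaryData.fromObject(parametersSchema)) call in the snippet.
        BinaryData schemaJson = BinaryData.fromObject(parametersSchema);
        System.out.println(schemaJson);
    }
}
```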

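The snippet also relies on two helper methods it doesn't define, `waitForArguments` and `getCurrentWeather`. The sketch below is a hypothetical stand-in for `getCurrentWeather` only: it parses the JSON `arguments` string supplied by the model (using the same Jackson `ObjectMapper` the snippet already uses for serialization) and returns canned values rather than calling a real weather service. Apart from the method name, everything here is an assumption for illustration; see `FunctionCallingSample.java` for the complete working example.

```java
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.HashMap;
import java.util.Map;

public class WeatherFunction {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Hypothetical stand-in for the getCurrentWeather helper referenced in the snippet.
    // Parses the model-provided JSON arguments and returns fake data; a real
    // implementation would call a weather API here.
    public static Map<String, Object> getCurrentWeather(String argumentsJson) throws Exception {
        Map<String, Object> args =
            MAPPER.readValue(argumentsJson, new TypeReference<Map<String, Object>>() { });

        Map<String, Object> result = new HashMap<>();
        result.put("location", args.getOrDefault("location", "unknown"));
        result.put("temperature", 22);                             // canned demo value
        result.put("unit", args.getOrDefault("unit", "celsius"));
        result.put("condition", "partly cloudy");                  // canned demo value
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Example of the JSON the model might send as function-call arguments.
        String sampleArguments = "{\"location\":\"Seattle, WA\",\"unit\":\"celsius\"}";
        System.out.println(MAPPER.writeValueAsString(getCurrentWeather(sampleArguments)));
    }
}
```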