Commit b623dc8

Method update
1 parent f725a62 commit b623dc8

7 files changed: +451 -15 lines changed

sdk/ai/Azure.AI.VoiceLive/Azure.AI.VoiceLive.sln

Lines changed: 6 additions & 6 deletions
@@ -1,14 +1,14 @@
 Microsoft Visual Studio Solution File, Format Version 12.00
 # Visual Studio Version 17
-VisualStudioVersion = 17.14.36301.6 d17.14
+VisualStudioVersion = 17.14.36301.6
 MinimumVisualStudioVersion = 10.0.40219.1
 Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Azure.Core.TestFramework", "..\..\core\Azure.Core.TestFramework\src\Azure.Core.TestFramework.csproj", "{ECC730C1-4AEA-420C-916A-66B19B79E4DC}"
 EndProject
 Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Azure.AI.VoiceLive", "src\Azure.AI.VoiceLive.csproj", "{28FF4005-4467-4E36-92E7-DEA27DEB1519}"
 EndProject
 Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Azure.AI.VoiceLive.Tests", "tests\Azure.AI.VoiceLive.Tests.csproj", "{1F1CD1D4-9932-4B73-99D8-C252A67D4B46}"
 EndProject
-Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "BasicVoiceAssistant", "..\..\..\..\..\..\scratch\vl_sample\samples\BasicVoiceAssistant.csproj", "{4F423188-2AE3-CB57-5BE2-808B33B8B5AB}"
+Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "BasicVoiceAssistant", "samples\BasicVoiceAssistant\BasicVoiceAssistant.csproj", "{C4593C0B-D995-3C0E-2B84-AE24A0B02506}"
 EndProject
 Global
 	GlobalSection(SolutionConfigurationPlatforms) = preSolution
@@ -28,10 +28,10 @@ Global
 		{1F1CD1D4-9932-4B73-99D8-C252A67D4B46}.Debug|Any CPU.Build.0 = Debug|Any CPU
 		{1F1CD1D4-9932-4B73-99D8-C252A67D4B46}.Release|Any CPU.ActiveCfg = Release|Any CPU
 		{1F1CD1D4-9932-4B73-99D8-C252A67D4B46}.Release|Any CPU.Build.0 = Release|Any CPU
-		{4F423188-2AE3-CB57-5BE2-808B33B8B5AB}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
-		{4F423188-2AE3-CB57-5BE2-808B33B8B5AB}.Debug|Any CPU.Build.0 = Debug|Any CPU
-		{4F423188-2AE3-CB57-5BE2-808B33B8B5AB}.Release|Any CPU.ActiveCfg = Release|Any CPU
-		{4F423188-2AE3-CB57-5BE2-808B33B8B5AB}.Release|Any CPU.Build.0 = Release|Any CPU
+		{C4593C0B-D995-3C0E-2B84-AE24A0B02506}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
+		{C4593C0B-D995-3C0E-2B84-AE24A0B02506}.Debug|Any CPU.Build.0 = Debug|Any CPU
+		{C4593C0B-D995-3C0E-2B84-AE24A0B02506}.Release|Any CPU.ActiveCfg = Release|Any CPU
+		{C4593C0B-D995-3C0E-2B84-AE24A0B02506}.Release|Any CPU.Build.0 = Release|Any CPU
 	EndGlobalSection
 	GlobalSection(SolutionProperties) = preSolution
 		HideSolutionNode = FALSE
Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
# VoiceLive Session Command Methods

This document describes the new convenience methods added to the `VoiceLiveSession` class to provide a more developer-friendly API similar to the OpenAI SDK.

## Overview

The methods below send control messages to the VoiceLive service without requiring developers to construct and populate `ClientEvent` classes by hand.

## New Methods Added

### Audio Stream Management

#### `ClearStreamingAudioAsync` / `ClearStreamingAudio`
Clears all input audio currently being streamed.

```csharp
// Async version
await session.ClearStreamingAudioAsync(cancellationToken);

// Sync version
session.ClearStreamingAudio(cancellationToken);
```

**Underlying ClientEvent:** `ClientEventInputAudioClear`

### Audio Turn Management

#### `StartAudioTurnAsync` / `StartAudioTurn`
Starts a new audio input turn with a unique identifier.

```csharp
string turnId = Guid.NewGuid().ToString();

// Async version
await session.StartAudioTurnAsync(turnId, cancellationToken);

// Sync version
session.StartAudioTurn(turnId, cancellationToken);
```

**Underlying ClientEvent:** `ClientEventInputAudioTurnStart`

#### `AppendAudioToTurnAsync` / `AppendAudioToTurn`
Appends audio data to an ongoing input turn. Two overloads accept the audio as either a `byte[]` or a `BinaryData`.

```csharp
string turnId = "some-turn-id";
byte[] audioData = GetAudioBytes();
BinaryData audioBinary = BinaryData.FromBytes(audioData);

// With byte array - Async
await session.AppendAudioToTurnAsync(turnId, audioData, cancellationToken);

// With byte array - Sync
session.AppendAudioToTurn(turnId, audioData, cancellationToken);

// With BinaryData - Async
await session.AppendAudioToTurnAsync(turnId, audioBinary, cancellationToken);

// With BinaryData - Sync
session.AppendAudioToTurn(turnId, audioBinary, cancellationToken);
```

**Underlying ClientEvent:** `ClientEventInputAudioTurnAppend`

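When the audio is already fully in memory, one way to feed the `byte[]` overload is to slice the buffer into fixed-size chunks first. The sketch below is only an illustration: `AudioChunker` and its `Split` method are hypothetical names, not part of the SDK, and the chunk size is left to the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of Azure.AI.VoiceLive.
public static class AudioChunker
{
    // Yields consecutive slices of `audio`, each at most `chunkSize` bytes long.
    // Because this is an iterator, the argument checks run when enumeration starts.
    public static IEnumerable<byte[]> Split(byte[] audio, int chunkSize)
    {
        if (audio is null) throw new ArgumentNullException(nameof(audio));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        for (int offset = 0; offset < audio.Length; offset += chunkSize)
        {
            // The final slice may be shorter than chunkSize.
            int length = Math.Min(chunkSize, audio.Length - offset);
            byte[] chunk = new byte[length];
            Buffer.BlockCopy(audio, offset, chunk, 0, length);
            yield return chunk;
        }
    }
}
```

Each chunk could then be passed to `AppendAudioToTurnAsync(turnId, chunk, cancellationToken)` inside a `foreach` loop.
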
#### `EndAudioTurnAsync` / `EndAudioTurn`
Marks the end of an audio input turn.

```csharp
string turnId = "some-turn-id";

// Async version
await session.EndAudioTurnAsync(turnId, cancellationToken);

// Sync version
session.EndAudioTurn(turnId, cancellationToken);
```

**Underlying ClientEvent:** `ClientEventInputAudioTurnEnd`

#### `CancelAudioTurnAsync` / `CancelAudioTurn`
Cancels an in-progress input audio turn.

```csharp
string turnId = "some-turn-id";

// Async version
await session.CancelAudioTurnAsync(turnId, cancellationToken);

// Sync version
session.CancelAudioTurn(turnId, cancellationToken);
```

**Underlying ClientEvent:** `ClientEventInputAudioTurnCancel`

### Avatar Management

#### `ConnectAvatarAsync` / `ConnectAvatar`
Connects the avatar by sending the client's SDP (Session Description Protocol) offer for avatar-related media negotiation.

```csharp
string clientSdp = GetClientSdpOffer();

// Async version
await session.ConnectAvatarAsync(clientSdp, cancellationToken);

// Sync version
session.ConnectAvatar(clientSdp, cancellationToken);
```

**Underlying ClientEvent:** `ClientEventSessionAvatarConnect`

## Complete Audio Turn Example

Here's a complete example showing how to use the audio turn management methods:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Azure.AI.VoiceLive;

public async Task HandleAudioTurnAsync(VoiceLiveSession session, Stream audioStream)
{
    string turnId = Guid.NewGuid().ToString();

    try
    {
        // Start the audio turn
        await session.StartAudioTurnAsync(turnId);

        // Read and append audio data in chunks
        byte[] buffer = new byte[4096];
        int bytesRead;

        while ((bytesRead = await audioStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Copy only the bytes actually read so a short read never sends stale data
            byte[] audioChunk = new byte[bytesRead];
            Array.Copy(buffer, audioChunk, bytesRead);

            await session.AppendAudioToTurnAsync(turnId, audioChunk);
        }

        // End the audio turn
        await session.EndAudioTurnAsync(turnId);
    }
    catch (Exception)
    {
        // Cancel the turn if something goes wrong, then rethrow the original exception
        await session.CancelAudioTurnAsync(turnId);
        throw;
    }
}
```

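The start, append, end, and cancel sequence can also be exercised without a live service. The following is a minimal sketch under stated assumptions: `IAudioTurnSession`, `RecordingSession`, and `AudioTurnFlow` are hypothetical types introduced only for illustration, and the interface simply mirrors the four turn method names from this document. It is not how `VoiceLiveSession` itself is defined.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstraction (not an SDK type): mirrors the four turn methods
// so the call sequence can be checked against an in-memory fake.
public interface IAudioTurnSession
{
    Task StartAudioTurnAsync(string turnId, CancellationToken cancellationToken = default);
    Task AppendAudioToTurnAsync(string turnId, byte[] audio, CancellationToken cancellationToken = default);
    Task EndAudioTurnAsync(string turnId, CancellationToken cancellationToken = default);
    Task CancelAudioTurnAsync(string turnId, CancellationToken cancellationToken = default);
}

// In-memory fake that records each call in the order it was made.
public sealed class RecordingSession : IAudioTurnSession
{
    public List<string> Calls { get; } = new List<string>();

    public Task StartAudioTurnAsync(string turnId, CancellationToken cancellationToken = default)
        => Record("start");

    public Task AppendAudioToTurnAsync(string turnId, byte[] audio, CancellationToken cancellationToken = default)
        => Record($"append:{audio.Length}");

    public Task EndAudioTurnAsync(string turnId, CancellationToken cancellationToken = default)
        => Record("end");

    public Task CancelAudioTurnAsync(string turnId, CancellationToken cancellationToken = default)
        => Record("cancel");

    private Task Record(string call)
    {
        Calls.Add(call);
        return Task.CompletedTask;
    }
}

public static class AudioTurnFlow
{
    public static async Task SendAsync(IAudioTurnSession session, Stream audio, int chunkSize)
    {
        string turnId = Guid.NewGuid().ToString();

        // Start sits outside the try block, so only a turn that actually started is cancelled.
        await session.StartAudioTurnAsync(turnId);
        try
        {
            byte[] buffer = new byte[chunkSize];
            int bytesRead;
            while ((bytesRead = await audio.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                byte[] chunk = new byte[bytesRead];
                Array.Copy(buffer, chunk, bytesRead);
                await session.AppendAudioToTurnAsync(turnId, chunk);
            }
            await session.EndAudioTurnAsync(turnId);
        }
        catch
        {
            await session.CancelAudioTurnAsync(turnId);
            throw;
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        var session = new RecordingSession();
        using var audio = new MemoryStream(new byte[10]);
        await AudioTurnFlow.SendAsync(session, audio, chunkSize: 4);
        Console.WriteLine(string.Join(", ", session.Calls));
    }
}
```

With a 10-byte `MemoryStream` and 4-byte chunks, the recorded sequence is `start, append:4, append:4, append:2, end`. Keeping the start call outside the `try` block is a design choice of this sketch, not something the SDK documentation prescribes.
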
## Design Principles

These methods follow the established patterns in the VoiceLive SDK:

1. **Both sync and async versions** are provided for all methods
2. **Proper parameter validation** using `Argument.AssertNotNull` and `Argument.AssertNotNullOrEmpty`
3. **Disposal checking** using `ThrowIfDisposed()`
4. **Consistent naming** that describes the action rather than just mirroring the event type
5. **Comprehensive documentation** with parameter descriptions and exception information
6. **JSON serialization** for sending commands, consistent with existing methods

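The principles above can be sketched roughly as follows. Everything here is an assumption made for illustration: `SketchSession`, the `send` delegate, the `example.turn_end` type string, and the JSON property names are hypothetical and do not reflect the actual SDK internals or wire format. In particular, having the sync method block on the async one is just one possible way to offer both versions.

```csharp
using System;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a session class; not the real VoiceLiveSession.
public sealed class SketchSession : IDisposable
{
    private readonly Func<string, CancellationToken, Task> _send;
    private bool _disposed;

    // `send` stands in for whatever transport delivers the serialized command.
    public SketchSession(Func<string, CancellationToken, Task> send)
        => _send = send ?? throw new ArgumentNullException(nameof(send));

    /// <summary>Marks the end of an audio input turn (illustrative only).</summary>
    public async Task EndAudioTurnAsync(string turnId, CancellationToken cancellationToken = default)
    {
        // Principle 2: validate parameters before doing any work.
        if (turnId is null) throw new ArgumentNullException(nameof(turnId));
        if (turnId.Length == 0) throw new ArgumentException("Value cannot be empty.", nameof(turnId));

        // Principle 3: refuse to operate on a disposed session.
        ThrowIfDisposed();

        // Principle 6: serialize the command to JSON and send it.
        // The type string and property names are placeholders, not the real protocol.
        string json = JsonSerializer.Serialize(new { type = "example.turn_end", turn_id = turnId });
        await _send(json, cancellationToken).ConfigureAwait(false);
    }

    // Principle 1: a sync counterpart. Blocking on the async path is an assumption of this sketch.
    public void EndAudioTurn(string turnId, CancellationToken cancellationToken = default)
        => EndAudioTurnAsync(turnId, cancellationToken).GetAwaiter().GetResult();

    private void ThrowIfDisposed()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(SketchSession));
    }

    public void Dispose() => _disposed = true;
}
```

A caller would construct the sketch with a delegate that captures or forwards the JSON, call `EndAudioTurn("t1")`, and observe an `ObjectDisposedException` if the same call is made after `Dispose()`.
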
## Previously Existing Methods

The following convenience methods were already available and remain unchanged:

- **Audio Buffer Management:** `SendInputAudioAsync`, `ClearInputAudioAsync`, `CommitInputAudioAsync`
- **Session Configuration:** `ConfigureSessionAsync`, `ConfigureConversationSessionAsync`, `ConfigureTranscriptionSessionAsync`
- **Item Management:** `AddItemAsync`, `RequestItemRetrievalAsync`, `DeleteItemAsync`, `TruncateConversationAsync`
- **Response Management:** `StartResponseAsync`, `CancelResponseAsync`

The new methods complement these existing ones to provide comprehensive coverage of all available `ClientEvent` types.

sdk/ai/Azure.AI.VoiceLive/samples/BasicVoiceAssistant/AudioProcessor.cs

Lines changed: 14 additions & 2 deletions
@@ -9,6 +9,15 @@ namespace Azure.AI.VoiceLive.Samples;
 /// <summary>
 /// Handles real-time audio capture and playback for the voice assistant.
 ///
+/// This processor demonstrates some of the new VoiceLive SDK convenience methods:
+/// - Uses existing SendInputAudioAsync() method for audio streaming
+/// - Shows how convenience methods simplify audio operations
+///
+/// Additional convenience methods available in the SDK:
+/// - StartAudioTurnAsync() / AppendAudioToTurnAsync() / EndAudioTurnAsync() - Audio turn management
+/// - ClearStreamingAudioAsync() - Clear all streaming audio
+/// - ConnectAvatarAsync() - Avatar connection with SDP
+///
 /// Threading Architecture:
 /// - Main thread: Event loop and UI
 /// - Capture thread: NAudio input stream reading
@@ -68,6 +77,7 @@ public AudioProcessor(VoiceLiveSession session, ILogger<AudioProcessor> logger)
         _audioPlaybackReader = _audioPlaybackChannel.Reader;
 
         _cancellationTokenSource = new CancellationTokenSource();
+        _playbackCancellationTokenSource = new CancellationTokenSource();
 
         _logger.LogInformation("AudioProcessor initialized with {SampleRate}Hz PCM16 mono audio", SampleRate);
     }
@@ -284,7 +294,9 @@ private async Task ProcessAudioSendAsync(CancellationToken cancellationToken)
 
                 try
                 {
-                    // Send audio data directly to the session
+                    // Send audio data directly to the session using the convenience method
+                    // This demonstrates the existing SendInputAudioAsync convenience method
+                    // Other available methods: StartAudioTurnAsync, AppendAudioToTurnAsync, EndAudioTurnAsync
                     await _session.SendInputAudioAsync(audioData, cancellationToken).ConfigureAwait(false);
                 }
                 catch (Exception ex)
@@ -374,4 +386,4 @@ public void Dispose()
         CleanupAsync().GetAwaiter().GetResult();
         _cancellationTokenSource.Dispose();
     }
-}
+}

sdk/ai/Azure.AI.VoiceLive/samples/BasicVoiceAssistant/BasicVoiceAssistant.cs

Lines changed: 27 additions & 6 deletions
@@ -7,6 +7,19 @@ namespace Azure.AI.VoiceLive.Samples;
 
 /// <summary>
 /// Basic voice assistant implementing the VoiceLive SDK patterns.
+///
+/// This sample now demonstrates some of the new convenience methods added to the VoiceLive SDK:
+/// - ClearStreamingAudioAsync() - Clears all input audio currently being streamed
+/// - CancelResponseAsync() - Cancels the current response generation (existing method)
+/// - ConfigureConversationSessionAsync() - Configures session options (existing method)
+///
+/// Additional convenience methods available but not shown in this sample:
+/// - StartAudioTurnAsync() / EndAudioTurnAsync() / CancelAudioTurnAsync() - Audio turn management
+/// - AppendAudioToTurnAsync() - Append audio data to an ongoing turn
+/// - ConnectAvatarAsync() - Connect avatar with SDP for media negotiation
+///
+/// These methods provide a more developer-friendly API similar to the OpenAI SDK,
+/// eliminating the need to manually construct and populate ClientEvent classes.
 /// </summary>
 public class BasicVoiceAssistant : IDisposable
 {
@@ -19,7 +32,6 @@ public class BasicVoiceAssistant : IDisposable
 
     private VoiceLiveSession? _session;
     private AudioProcessor? _audioProcessor;
-    private bool _sessionReady;
     private bool _disposed;
 
     /// <summary>
@@ -176,8 +188,7 @@ private async Task HandleServerEventAsync(ServerEvent serverEvent, CancellationT
 
             case ServerEventSessionUpdated sessionUpdated:
                 _logger.LogInformation("Session updated successfully");
-                _sessionReady = true;
-
+
                 // Start audio capture once session is ready
                 if (_audioProcessor != null)
                 {
@@ -204,6 +215,17 @@ private async Task HandleServerEventAsync(ServerEvent serverEvent, CancellationT
                 {
                     _logger.LogDebug(ex, "No response to cancel");
                 }
+
+                // Demonstrate the new ClearStreamingAudio convenience method
+                try
+                {
+                    await _session!.ClearStreamingAudioAsync(cancellationToken).ConfigureAwait(false);
+                    _logger.LogInformation("✨ Used ClearStreamingAudioAsync convenience method");
+                }
+                catch (Exception ex)
+                {
+                    _logger.LogDebug(ex, "ClearStreamingAudio call failed (may not be supported in all scenarios)");
+                }
                 break;
 
             case ServerEventInputAudioBufferSpeechStopped speechStopped:
@@ -258,8 +280,7 @@ private async Task HandleServerEventAsync(ServerEvent serverEvent, CancellationT
     private async Task HandleSessionCreatedAsync(ServerEventSessionCreated sessionCreated, CancellationToken cancellationToken)
     {
         _logger.LogInformation("Session ready: {SessionId}", sessionCreated.Session?.Id);
-        _sessionReady = true;
-
+
         // Start audio capture once session is ready
         if (_audioProcessor != null)
         {
@@ -279,4 +300,4 @@ public void Dispose()
         _session?.Dispose();
         _disposed = true;
     }
-}
+}

sdk/ai/Azure.AI.VoiceLive/samples/BasicVoiceAssistant/BasicVoiceAssistant.csproj

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@
   </ItemGroup>
 
   <ItemGroup>
-    <ProjectReference Include="..\..\..\git\sdk-repos\azure-sdk-for-net\sdk\ai\Azure.AI.VoiceLive\src\Azure.AI.VoiceLive.csproj" />
+    <ProjectReference Include="..\..\src\Azure.AI.VoiceLive.csproj" />
   </ItemGroup>
 
   <ItemGroup>

sdk/ai/Azure.AI.VoiceLive/samples/BasicVoiceAssistant/README.md

Lines changed: 18 additions & 0 deletions
@@ -2,6 +2,24 @@
 
 This sample demonstrates the fundamental capabilities of the Azure VoiceLive SDK by creating a basic voice assistant that can engage in natural conversation with proper interruption handling. This serves as the foundational example that showcases the core value proposition of unified speech-to-speech interaction.
 
+## New VoiceLive SDK Convenience Methods
+
+This sample now demonstrates some of the new convenience methods added to the VoiceLive SDK for better developer experience:
+
+**Used in this sample:**
+- `ClearStreamingAudioAsync()` - Clears all input audio currently being streamed
+- `ConfigureConversationSessionAsync()` - Configures conversation session options
+- `CancelResponseAsync()` - Cancels the current response generation
+- `SendInputAudioAsync()` - Sends audio data to the service
+
+**Additional convenience methods available:**
+- `StartAudioTurnAsync()` / `EndAudioTurnAsync()` / `CancelAudioTurnAsync()` - Audio turn management
+- `AppendAudioToTurnAsync()` - Append audio data to an ongoing turn
+- `ConnectAvatarAsync()` - Connect avatar with SDP for media negotiation
+- `CommitInputAudioAsync()` / `ClearInputAudioAsync()` - Audio buffer operations
+
+These methods eliminate the need to manually construct and populate `ClientEvent` classes, providing a more developer-friendly API similar to the OpenAI SDK.
+
 ## Features
 
 - **Real-time voice conversation**: Seamless bidirectional audio streaming
