Commit 46d8caa

xitzhang, Xiting Zhang, and Copilot authored
[VoiceLive] Add async function-calling agent sample (#42978)
* [VoiceLive] Add async function-calling agent sample
* add phrase list
* fix typo
* Update sdk/ai/azure-ai-voicelive/samples/async_function_calling_sample.py

  Co-authored-by: Copilot <[email protected]>
* Update sdk/ai/azure-ai-voicelive/samples/async_function_calling_sample.py

  Co-authored-by: Copilot <[email protected]>
* update
* fix typo
* update changelog
* update
* remove breaking change section
* update changelog
* fix change log
* revert changelog I lost

---------

Co-authored-by: Xiting Zhang <[email protected]>
Co-authored-by: Copilot <[email protected]>
1 parent eb43bd0 commit 46d8caa

10 files changed: +896 -65 lines changed


sdk/ai/azure-ai-voicelive/.env.template

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 # Copy this file to .env and fill in your values
 
 # Required credentials
-AZURE_VOICELIVE_KEY=your-voicelive-api-key
+AZURE_VOICELIVE_API_KEY=your-voicelive-api-key
 AZURE_VOICELIVE_ENDPOINT=wss://api.voicelive.com/v1
 
 # Optional configuration
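
Because this commit renames the credential variable from `AZURE_VOICELIVE_KEY` to `AZURE_VOICELIVE_API_KEY`, existing `.env` files will stop being picked up until they are updated. The following is a minimal sketch of a transitional lookup that prefers the new name and falls back to the old one. The helper `resolve_api_key` is hypothetical and is not part of the SDK or its samples; only the two variable names come from the diff.

```python
import os
import warnings
from typing import Mapping, Optional

NEW_KEY_VAR = "AZURE_VOICELIVE_API_KEY"  # name introduced by this commit
OLD_KEY_VAR = "AZURE_VOICELIVE_KEY"      # name the samples used before this commit


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Hypothetical helper: prefer the new variable, fall back to the old one."""
    env = os.environ if env is None else env
    key = env.get(NEW_KEY_VAR)
    if key:
        return key
    legacy = env.get(OLD_KEY_VAR)
    if legacy:
        # Nudge the user to rename the variable in their .env file.
        warnings.warn(
            f"{OLD_KEY_VAR} was renamed; set {NEW_KEY_VAR} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return legacy
    return None
```

Passing a mapping instead of reading `os.environ` directly keeps the helper easy to test; calling `resolve_api_key()` with no argument reads the real environment.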

sdk/ai/azure-ai-voicelive/CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -1,12 +1,23 @@
 # Release History
 
+## 1.0.0b3 (Unreleased)
+
+### Features Added
+
+- Phrase list
+
+### Breaking Changes
+
+- Removed `custom_model` and `enabled` from `AudioInputTranscriptionSettings`.
+
 ## 1.0.0b2 (2025-09-10)
 
 ### Features Added
 
 - Async function call
 
 ### Bugs Fixed
+
 - Fixed function calling: ensure `FunctionCallOutputItem.output` is properly serialized as a JSON string before sending to the service.
 
 ## 1.0.0b1 (2025-08-28)
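
The 1.0.0b2 bug-fix entry says that `FunctionCallOutputItem.output` must be a JSON string rather than a raw Python object. As a rough illustration of what that means for sample code that produces tool results, here is a sketch using only the standard library. The helper name `serialize_function_output` is an assumption for illustration; it is not an SDK function, and the SDK's own serialization may differ.

```python
import json
from typing import Any


def serialize_function_output(result: Any) -> str:
    """Hypothetical helper: turn a tool result into JSON text.

    Strings are assumed to be already-serialized text and are passed through;
    everything else (dicts, lists, numbers, ...) is encoded with json.dumps.
    """
    if isinstance(result, str):
        return result
    return json.dumps(result, ensure_ascii=False)


# Example tool result as a plain dict.
weather = {"city": "Seattle", "temp_c": 21}
output_text = serialize_function_output(weather)
```

The resulting `output_text` is a `str`, so it can be assigned to a field that expects JSON text and decoded again with `json.loads` on the receiving side.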

sdk/ai/azure-ai-voicelive/azure/ai/voicelive/_version.py

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@
 # Changes may cause incorrect behavior and will be lost if the code is regenerated.
 # --------------------------------------------------------------------------
 
-VERSION = "1.0.0b2"
+VERSION = "1.0.0b3"

sdk/ai/azure-ai-voicelive/azure/ai/voicelive/models/_models.py

Lines changed: 37 additions & 31 deletions
@@ -252,47 +252,53 @@ def __init__(self, *args: Any, **kwargs: Any) -> None:
 class AudioInputTranscriptionSettings(_Model):
     """Configuration for input audio transcription.
 
-    :ivar model: The model used for transcription. E.g., 'whisper-1', 'azure-fast-transcription',
-     's2s-ingraph'. Required. Is one of the following types: Literal["whisper-1"],
-     Literal["azure-fast-transcription"], Literal["s2s-ingraph"]
+    :ivar model: The transcription model to use. Supported values:
+     "whisper-1", "gpt-4o-transcribe", "gpt-4o-mini-transcribe",
+     "azure-fast-transcription", "azure-speech". Required.
     :vartype model: str
-    :ivar language: The language code to use for transcription, if specified.
-    :vartype language: str
-    :ivar enabled: Whether transcription is enabled. Required.
-    :vartype enabled: bool
-    :ivar custom_model: Whether a custom model is being used. Required.
-    :vartype custom_model: bool
+    :ivar language: Optional BCP-47 language code (e.g., "en-US").
+    :vartype language: str | None
+    :ivar custom_speech: Optional configuration for custom speech models.
+    :vartype custom_speech: dict[str, str] | None
+    :ivar phrase_list: Optional list of phrase hints to bias recognition.
+    :vartype phrase_list: list[str] | None
     """
 
-    model: Literal["whisper-1", "azure-fast-transcription", "s2s-ingraph"] = rest_field(
-        visibility=["read", "create", "update", "delete", "query"]
-    )
-    """The model used for transcription. E.g., 'whisper-1', 'azure-fast-transcription', 's2s-ingraph'.
-     Required. Is one of the following types: Literal[\"whisper-1\"],
-     Literal[\"azure-fast-transcription\"], Literal[\"s2s-ingraph\"]"""
+    model: Literal[
+        "whisper-1",
+        "gpt-4o-transcribe",
+        "gpt-4o-mini-transcribe",
+        "azure-fast-transcription",
+        "azure-speech",
+    ] = rest_field(visibility=["read", "create", "update", "delete", "query"])
+    """Required transcription model."""
+
     language: Optional[str] = rest_field(visibility=["read", "create", "update", "delete", "query"])
-    """The language code to use for transcription, if specified."""
-    enabled: bool = rest_field(visibility=["read", "create", "update", "delete", "query"])
-    """Whether transcription is enabled. Required."""
-    custom_model: bool = rest_field(visibility=["read", "create", "update", "delete", "query"])
-    """Whether a custom model is being used. Required."""
+    """Optional language code (e.g., 'en-US')."""
+
+    custom_speech: Optional[Dict[str, str]] = rest_field(visibility=["read", "create", "update", "delete", "query"])
+    """Optional custom speech configuration."""
+
+    phrase_list: Optional[List[str]] = rest_field(visibility=["read", "create", "update", "delete", "query"])
+    """Optional phrase hints."""
 
     @overload
     def __init__(
         self,
         *,
-        model: Literal["whisper-1", "azure-fast-transcription", "s2s-ingraph"],
-        enabled: bool,
-        custom_model: bool,
-        language: Optional[str] = None,
+        model: Literal[
+            "whisper-1",
+            "gpt-4o-transcribe",
+            "gpt-4o-mini-transcribe",
+            "azure-fast-transcription",
+            "azure-speech",
+        ],
+        language: Optional[str] = ...,
+        custom_speech: Optional[Dict[str, str]] = ...,
+        phrase_list: Optional[List[str]] = ...,
     ) -> None: ...
-
     @overload
-    def __init__(self, mapping: Mapping[str, Any]) -> None:
-        """
-        :param mapping: raw JSON to initialize the model.
-        :type mapping: Mapping[str, Any]
-        """
+    def __init__(self, mapping: Mapping[str, Any]) -> None: ...
 
     def __init__(self, *args: Any, **kwargs: Any) -> None:
         super().__init__(*args, **kwargs)
@@ -3317,7 +3323,7 @@ def __init__(self, *args: Any, **kwargs: Any) -> None:
         super().__init__(*args, **kwargs)
 
     @classmethod
-    def deserialize(cls, payload: dict[str, Any]) -> "ServerEvent":
+    def deserialize(cls, payload: Dict[str, Any]) -> "ServerEvent":
         # public, linter-friendly entrypoint
         # pylint: disable-next=protected-access
         return cls._deserialize(payload, [])
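
To make the new shape of `AudioInputTranscriptionSettings` concrete, the sketch below mirrors its four fields in a plain dataclass so it runs without the SDK installed. `TranscriptionSettingsSketch`, `SUPPORTED_MODELS`, and `as_payload` are illustrative names, not SDK API; the field names and the list of model values are taken from the diff above. Unlike the SDK's `Literal` annotation, which is only a type hint, this stand-in checks the model name at runtime, which is a choice made for the example.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional

# Model values listed in the updated Literal[...] annotation.
SUPPORTED_MODELS = (
    "whisper-1",
    "gpt-4o-transcribe",
    "gpt-4o-mini-transcribe",
    "azure-fast-transcription",
    "azure-speech",
)


@dataclass
class TranscriptionSettingsSketch:
    """Stand-in with the same fields as the updated SDK model (not the SDK class)."""

    model: str
    language: Optional[str] = None
    custom_speech: Optional[Dict[str, str]] = None
    phrase_list: Optional[List[str]] = None

    def __post_init__(self) -> None:
        if self.model not in SUPPORTED_MODELS:
            raise ValueError(f"unsupported transcription model: {self.model!r}")

    def as_payload(self) -> Dict[str, object]:
        """Return only the fields that were actually set."""
        return {k: v for k, v in asdict(self).items() if v is not None}


settings = TranscriptionSettingsSketch(
    model="azure-speech",
    language="en-US",
    phrase_list=["VoiceLive", "Contoso"],
)
payload = settings.as_payload()
```

Code written against 1.0.0b2 that passes `enabled=` or `custom_model=` would fail against this shape with a `TypeError`, which is the breaking change recorded in the changelog; `"s2s-ingraph"` is also no longer among the listed model values.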

sdk/ai/azure-ai-voicelive/pyproject.toml

Lines changed: 36 additions & 26 deletions
@@ -1,42 +1,55 @@
+# --------------------------------------------------------------------------
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License. See License.txt in the project root for license information.
+# Code generated by Microsoft (R) Python Code Generator.
+# Changes may cause incorrect behavior and will be lost if the code is regenerated.
+# --------------------------------------------------------------------------
+
 [build-system]
 requires = ["setuptools>=77.0.3", "wheel"]
 build-backend = "setuptools.build_meta"
 
 [project]
 name = "azure-ai-voicelive"
-authors = [{ name = "Microsoft Corporation", email = "[email protected]" }]
-description = "Microsoft Corporation Azure AI VoiceLive Client Library for Python"
+authors = [
+    { name = "Microsoft Corporation", email = "[email protected]" },
+]
+description = "Microsoft Corporation Azure Ai Voicelive Client Library for Python"
 license = "MIT"
 classifiers = [
-  "Development Status :: 4 - Beta",
-  "Programming Language :: Python",
-  "Programming Language :: Python :: 3 :: Only",
-  "Programming Language :: Python :: 3",
-  "Programming Language :: Python :: 3.9",
-  "Programming Language :: Python :: 3.10",
-  "Programming Language :: Python :: 3.11",
-  "Programming Language :: Python :: 3.12",
-  "Programming Language :: Python :: 3.13",
+    "Development Status :: 4 - Beta",
+    "Programming Language :: Python",
+    "Programming Language :: Python :: 3 :: Only",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.9",
+    "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "Programming Language :: Python :: 3.13",
 ]
 requires-python = ">=3.9"
-keywords = ["azure", "azure sdk", "voice", "voicelive", "realtime", "websocket", "audio"]
+keywords = ["azure", "azure sdk"]
 
 dependencies = [
-  "isodate>=0.6.1",
-  "azure-core>=1.35.0",
-  "typing-extensions>=4.6.0",
+    "isodate>=0.6.1",
+    "azure-core>=1.35.0",
+    "typing-extensions>=4.6.0",
+]
+dynamic = [
+    "version", "readme"
 ]
-
-dynamic = ["version", "readme"]
 
 [project.optional-dependencies]
-aiohttp = ["aiohttp>=3.9.0,<4.0.0"]
-websockets = ["websockets>=12.0,<14.0"]
+aiohttp = [
+    "aiohttp>=3.9.0,<4.0.0",
+]
+websockets = [
+    "websockets>=12.0,<14.0",
+]
 all-websockets = [
     "aiohttp>=3.9.0,<4.0.0",
     "websockets>=12.0,<14.0",
 ]
-
 test = [
     "pytest>=8.0",
     "pytest-asyncio>=0.23",
@@ -50,17 +63,14 @@ test = [
 ]
 
 [project.urls]
-Repository = "https://github.com/Azure/azure-sdk-for-python"
+repository = "https://github.com/Azure/azure-sdk-for-python"
 
 [tool.setuptools.dynamic]
-version = { attr = "azure.ai.voicelive._version.VERSION" }
-readme = { file = ["README.md", "CHANGELOG.md"], content-type = "text/markdown" }
+version = {attr = "azure.ai.voicelive._version.VERSION"}
+readme = {file = ["README.md", "CHANGELOG.md"], content-type = "text/markdown"}
 
 [tool.setuptools.packages.find]
 include = ["azure.ai.voicelive", "azure.ai.voicelive.*"]
 
 [tool.setuptools.package-data]
 "azure.ai.voicelive" = ["py.typed"]
-
-[tool.azure-sdk-build]
-verifytypes = false

sdk/ai/azure-ai-voicelive/samples/BASIC_VOICE_ASSISTANT.md

Lines changed: 2 additions & 2 deletions
@@ -34,7 +34,7 @@ This sample demonstrates the fundamental capabilities of the Azure VoiceLive SDK
 
 Or set environment variables directly:
 ```bash
-export AZURE_VOICELIVE_KEY="your-api-key"
+export AZURE_VOICELIVE_API_KEY="your-api-key"
 export AZURE_VOICELIVE_ENDPOINT="wss://api.voicelive.com/v1"
 export VOICELIVE_MODEL="gpt-4o-realtime-preview"
 export VOICELIVE_VOICE="en-US-AvaNeural"
@@ -63,7 +63,7 @@ python basic_voice_assistant.py \
 
 ### Available Options
 
-- `--api-key`: Azure VoiceLive API key (or use AZURE_VOICELIVE_KEY env var)
+- `--api-key`: Azure VoiceLive API key (or use AZURE_VOICELIVE_API_KEY env var)
 - `--endpoint`: VoiceLive endpoint URL
 - `--model`: Model to use (default: gpt-4o-realtime-preview)
 - `--voice`: Voice for the assistant (alloy, echo, fable, onyx, nova, shimmer, en-US-AvaNeural, etc.)

sdk/ai/azure-ai-voicelive/samples/README.md

Lines changed: 2 additions & 2 deletions
@@ -21,7 +21,7 @@ This directory contains sample applications demonstrating various capabilities o
 Create a `.env` file at the root of the azure-ai-voicelive directory or in the samples directory with the following variables:
 
 ```ini
-AZURE_VOICELIVE_KEY=your-voicelive-api-key
+AZURE_VOICELIVE_API_KEY=your-voicelive-api-key
 AZURE_VOICELIVE_ENDPOINT=wss://api.voicelive.com/v1
 VOICELIVE_MODEL=gpt-4o-realtime-preview
 VOICELIVE_VOICE=alloy
@@ -107,7 +107,7 @@ python sample_voicelive_async.py --help
   - Confirm your network allows WSS to the service
 
 - **Auth errors**
-  - For API key: confirm `AZURE_VOICELIVE_KEY`
+  - For API key: confirm `AZURE_VOICELIVE_API_KEY`
   - For AAD: ensure your identity has access to the resource
 
 ## Next steps
