Commit 46f075e

Merge pull request #278303 from MicrosoftDocs/main
6/14/2024 AM Publish
2 parents 37e6b4f + 464f64d commit 46f075e

146 files changed (+589 −422 lines)


articles/ai-services/openai/assistants-reference-messages.md

Lines changed: 2 additions & 2 deletions
@@ -36,7 +36,7 @@ Create a message.

|Name | Type | Required | Description |
|--- |--- |--- |--- |
-| `role` | string | Required | The role of the entity that is creating the message. Can be `user` or `assistant`. `assistant` indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages. `assistant` indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation. |
+| `role` | string | Required | The role of the entity that is creating the message. Can be `user` or `assistant`. `user` indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages. `assistant` indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation. |
| `content` | string | Required | The content of the message. |
| `file_ids` | array | Optional | A list of File IDs that the message should use. There can be a maximum of 10 files attached to a message. Useful for tools like retrieval and code_interpreter that can access and use files. |
| `metadata` | map | Optional | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. |

@@ -371,7 +371,7 @@ Represents a message within a thread.

| `object` | string |The object type, which is always thread.message.|
| `created_at` | integer |The Unix timestamp (in seconds) for when the message was created.|
| `thread_id` | string |The thread ID that this message belongs to.|
-| `role` | string |The entity that produced the message. One of user or assistant.|
+| `role` | string |The entity that produced the message. One of `user` or `assistant`.|
| `content` | array |The content of the message in array of text and/or images.|
| `assistant_id` | string or null |If applicable, the ID of the assistant that authored this message.|
| `run_id` | string or null |If applicable, the ID of the run associated with the authoring of this message.|
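The field constraints in the request-body table above (two allowed `role` values, at most 10 `file_ids`, 16 metadata pairs with 64/512-character limits) lend themselves to a client-side check. Below is a minimal sketch; `validate_message` is a hypothetical helper, not part of any Azure OpenAI SDK:

```python
# Hypothetical client-side validator for the Create Message request body,
# based on the field table above (not part of any SDK).

def validate_message(payload: dict) -> list[str]:
    errors = []
    # `role` is required and must be "user" or "assistant".
    if payload.get("role") not in ("user", "assistant"):
        errors.append("role must be 'user' or 'assistant'")
    # `content` is required and is a string.
    if not isinstance(payload.get("content"), str):
        errors.append("content must be a string")
    # `file_ids` is optional, with at most 10 entries.
    if len(payload.get("file_ids", [])) > 10:
        errors.append("at most 10 files can be attached to a message")
    # `metadata` is optional: up to 16 pairs, 64-char keys, 512-char values.
    meta = payload.get("metadata", {})
    if len(meta) > 16:
        errors.append("metadata allows at most 16 key-value pairs")
    for key, value in meta.items():
        if len(key) > 64 or len(str(value)) > 512:
            errors.append(f"metadata entry {key!r} exceeds length limits")
    return errors

ok = {"role": "user", "content": "Hello", "metadata": {"source": "docs"}}
bad = {"role": "system", "content": "Hi", "file_ids": ["f"] * 11}
print(validate_message(ok))   # []
print(validate_message(bad))  # two errors: bad role, too many files
```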

articles/ai-services/speech-service/how-to-lower-speech-synthesis-latency.md

Lines changed: 67 additions & 0 deletions
@@ -318,6 +318,73 @@ For Linux and Windows, `GStreamer` is required to enable this feature.

Refer [this instruction](how-to-use-codec-compressed-audio-input-streams.md) to install and configure `GStreamer` for Speech SDK.
For Android, iOS and macOS, no extra configuration is needed starting version 1.20.
+
+## Text streaming
+
+Text streaming allows real-time text processing for rapid audio generation. It's perfect for dynamic text vocalization, such as reading outputs from AI models like GPT in real-time. This feature minimizes latency and improves the fluidity and responsiveness of audio outputs, making it ideal for interactive applications, live events, and responsive AI-driven dialogues.
+
+### How to use text streaming
+
+To use the text streaming feature, connect to the websocket V2 endpoint: `wss://{region}.tts.speech.microsoft.com/cognitiveservices/websocket/v2`
+
+::: zone pivot="programming-language-csharp"
+
+See the sample code for setting the endpoint:
+
+```csharp
+// IMPORTANT: MUST use the websocket v2 endpoint
+var ttsEndpoint = $"wss://{Environment.GetEnvironmentVariable("AZURE_TTS_REGION")}.tts.speech.microsoft.com/cognitiveservices/websocket/v2";
+var speechConfig = SpeechConfig.FromEndpoint(
+    new Uri(ttsEndpoint),
+    Environment.GetEnvironmentVariable("AZURE_TTS_API_KEY"));
+```
+
+#### Key steps
+
+1. **Create a text stream request**: Use `SpeechSynthesisRequestInputType.TextStream` to initiate a text stream.
+1. **Set global properties**: Adjust settings such as output format and voice name directly, as the feature handles partial text inputs and doesn't support SSML. Refer to the following sample code for instructions on how to set them. OpenAI text to speech voices aren't supported by the text streaming feature. See this [language table](language-support.md?tabs=tts#supported-languages) for full language support.
+
+    ```csharp
+    // Set output format
+    speechConfig.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Raw24Khz16BitMonoPcm);
+
+    // Set a voice name
+    SpeechConfig.SetProperty(PropertyId.SpeechServiceConnection_SynthVoice, "en-US-AvaMultilingualNeural");
+    ```
+
+1. **Stream your text**: For each text chunk generated from a GPT model, use `request.InputStream.Write(text);` to send the text to the stream.
+1. **Close the stream**: Once the GPT model completes its output, close the stream using `request.InputStream.Close();`.
+
+For detailed implementation, see the [sample code on GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/csharp/tts-text-stream)
+
+::: zone-end
+
+::: zone pivot="programming-language-python"
+
+See the sample code for setting the endpoint:
+
+```python
+# IMPORTANT: MUST use the websocket v2 endpoint
+speech_config = speechsdk.SpeechConfig(endpoint=f"wss://{os.getenv('AZURE_TTS_REGION')}.tts.speech.microsoft.com/cognitiveservices/websocket/v2",
+                                       subscription=os.getenv("AZURE_TTS_API_KEY"))
+```
+
+#### Key steps
+
+1. **Create a text stream request**: Use `speechsdk.SpeechSynthesisRequestInputType.TextStream` to initiate a text stream.
+1. **Set global properties**: Adjust settings such as output format and voice name directly, as the feature handles partial text inputs and doesn't support SSML. Refer to the following sample code for instructions on how to set them. OpenAI text to speech voices aren't supported by the text streaming feature. See this [language table](language-support.md?tabs=tts#supported-languages) for full language support.
+
+    ```python
+    # set a voice name
+    speech_config.speech_synthesis_voice_name = "en-US-AvaMultilingualNeural"
+    ```
+
+1. **Stream your text**: For each text chunk generated from a GPT model, use `request.input_stream.write(text)` to send the text to the stream.
+1. **Close the stream**: Once the GPT model completes its output, close the stream using `request.input_stream.close()`.
+
+For detailed implementation, see the [sample code on GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/python/tts-text-stream).
+
+::: zone-end
+
## Others tips

### Cache CRL files
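The write-then-close contract in the text-streaming key steps above can be illustrated in plain Python, independent of the Speech SDK. `TextStream` below is a hypothetical stand-in for the SDK's `request.input_stream` (the real SDK manages the websocket v2 transport internally); it just shows the pattern of a producer writing GPT chunks as they arrive while a consumer can begin work before the full text exists:

```python
# Illustration of the chunk-write/close pattern from the key steps above.
# TextStream is a hypothetical stand-in for the SDK's request.input_stream;
# the real Speech SDK performs synthesis over the websocket v2 transport.
import queue

class TextStream:
    _DONE = object()  # sentinel marking end of stream

    def __init__(self):
        self._q = queue.Queue()

    def write(self, text: str):
        # Producer side: send one chunk as soon as the model emits it.
        self._q.put(text)

    def close(self):
        # Producer side: signal that no more text is coming.
        self._q.put(self._DONE)

    def chunks(self):
        # Consumer side: yields chunks as they arrive until close().
        while (item := self._q.get()) is not self._DONE:
            yield item

stream = TextStream()
for chunk in ["Hello", ", ", "world", "!"]:  # e.g. GPT model output chunks
    stream.write(chunk)
stream.close()

# A synthesizer could start on the first chunk immediately.
print("".join(stream.chunks()))  # Hello, world!
```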

articles/ai-services/speech-service/includes/how-to/professional-voice/create-consent/rest.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ ms.custom: include

With the professional voice feature, it's required that every voice be created with explicit consent from the user. A recorded statement from the user is required acknowledging that the customer (Azure AI Speech resource owner) will create and use their voice.

-To add voice talent consent to the professional voice project, you get the prerecorded consent audio file from a publicly accessible URL ([Consents_Create](/rest/api/speechapi/consents/create)) or upload the audio file ([Consents_Post](/rest/api/speechapi/consents/post)). In this article, you add consent from a URL.
+To add voice talent consent to the professional voice project, you get the prerecorded consent audio file from a publicly accessible URL ([Consents_Create](/rest/api/aiservices/speechapi/consents/create)) or upload the audio file ([Consents_Post](/rest/api/aiservices/speechapi/consents/post)). In this article, you add consent from a URL.

## Consent statement

@@ -25,15 +25,15 @@ You can get the consent statement text for each locale from the text to speech G

## Add consent from a URL

-To add consent to a professional voice project from the URL of an audio file, use the [Consents_Create](/rest/api/speechapi/consents/create) operation of the custom voice API. Construct the request body according to the following instructions:
+To add consent to a professional voice project from the URL of an audio file, use the [Consents_Create](/rest/api/aiservices/speechapi/consents/create) operation of the custom voice API. Construct the request body according to the following instructions:

- Set the required `projectId` property. See [create a project](../../../../professional-voice-create-project.md).
- Set the required `voiceTalentName` property. The voice talent name can't be changed later.
- Set the required `companyName` property. The company name can't be changed later.
- Set the required `audioUrl` property. The URL of the voice talent consent audio file. Use a URI with the [shared access signatures (SAS)](/azure/storage/common/storage-sas-overview) token.
- Set the required `locale` property. This should be the locale of the consent. The locale can't be changed later. You can find the text to speech locale list [here](/azure/ai-services/speech-service/language-support?tabs=tts).

-Make an HTTP PUT request using the URI as shown in the following [Consents_Create](/rest/api/speechapi/consents/create) example.
+Make an HTTP PUT request using the URI as shown in the following [Consents_Create](/rest/api/aiservices/speechapi/consents/create) example.
- Replace `YourResourceKey` with your Speech resource key.
- Replace `YourResourceRegion` with your Speech resource region.
- Replace `JessicaConsentId` with a consent ID of your choice. The case sensitive ID will be used in the consent's URI and can't be changed later.

@@ -46,7 +46,7 @@ curl -v -X PUT -H "Ocp-Apim-Subscription-Key: YourResourceKey" -H "Content-Type:
"companyName": "Contoso",
"audioUrl": "https://contoso.blob.core.windows.net/public/jessica-consent.wav?mySasToken",
"locale": "en-US"
-} ' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/consents/JessicaConsentId?api-version=2023-12-01-preview"
+} ' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/consents/JessicaConsentId?api-version=2024-02-01-preview"
```

You should receive a response body in the following format:

@@ -65,10 +65,10 @@ You should receive a response body in the following format:
}
```

-The response header contains the `Operation-Location` property. Use this URI to get details about the [Consents_Create](/rest/api/speechapi/consents/create) operation. Here's an example of the response header:
+The response header contains the `Operation-Location` property. Use this URI to get details about the [Consents_Create](/rest/api/aiservices/speechapi/consents/create) operation. Here's an example of the response header:

```HTTP 201
-Operation-Location: https://eastus.api.cognitive.microsoft.com/customvoice/operations/070f7986-ef17-41d0-ba2b-907f0f28e314?api-version=2023-12-01-preview
+Operation-Location: https://eastus.api.cognitive.microsoft.com/customvoice/operations/070f7986-ef17-41d0-ba2b-907f0f28e314?api-version=2024-02-01-preview
Operation-Id: 070f7986-ef17-41d0-ba2b-907f0f28e314
```
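As a rough sketch, the Consents_Create curl command above maps to the following standard-library Python. Placeholders (`YourResourceKey`, `YourResourceRegion`) and the example talent name are assumptions to be replaced, exactly as in the curl example; the request is only built here, not sent:

```python
# Sketch of the Consents_Create PUT request from the curl example above.
# YourResourceKey / YourResourceRegion are placeholders, and the talent
# name is an example value; replace them before actually sending.
import json
import urllib.request

region = "YourResourceRegion"
body = {
    "projectId": "ProjectId",
    "voiceTalentName": "Jessica Smith",  # required; can't be changed later
    "companyName": "Contoso",            # required; can't be changed later
    "audioUrl": "https://contoso.blob.core.windows.net/public/jessica-consent.wav?mySasToken",
    "locale": "en-US",                   # must match the consent recording
}
url = (f"https://{region}.api.cognitive.microsoft.com/customvoice/"
       "consents/JessicaConsentId?api-version=2024-02-01-preview")

req = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Ocp-Apim-Subscription-Key": "YourResourceKey",
        "Content-Type": "application/json",
    },
    method="PUT",
)
print(req.get_method(), req.full_url)
# Sending is omitted here; urllib.request.urlopen(req) would perform the call.
```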

articles/ai-services/speech-service/includes/how-to/professional-voice/create-project/rest.md

Lines changed: 3 additions & 3 deletions
@@ -15,12 +15,12 @@ Each project is specific to a country/region and language, and the gender of the

## Create a project

-To create a professional voice project, use the [Projects_Create](/rest/api/speechapi/projects/create) operation of the custom voice API. Construct the request body according to the following instructions:
+To create a professional voice project, use the [Projects_Create](/rest/api/aiservices/speechapi/projects/create) operation of the custom voice API. Construct the request body according to the following instructions:

- Set the required `kind` property to `ProfessionalVoice`. The kind can't be changed later.
- Optionally, set the `description` property for the project description. The project description can be changed later.

-Make an HTTP PUT request using the URI as shown in the following [Projects_Create](/rest/api/speechapi/projects/create) example.
+Make an HTTP PUT request using the URI as shown in the following [Projects_Create](/rest/api/aiservices/speechapi/projects/create) example.
- Replace `YourResourceKey` with your Speech resource key.
- Replace `YourResourceRegion` with your Speech resource region.
- Replace `ProjectId` with a project ID of your choice. The case sensitive ID must be unique within your Speech resource. The ID will be used in the project's URI and can't be changed later.

@@ -29,7 +29,7 @@ Make an HTTP PUT request using the URI as shown in the following [Projects_Creat
curl -v -X PUT -H "Ocp-Apim-Subscription-Key: YourResourceKey" -H "Content-Type: application/json" -d '{
"description": "Project description",
"kind": "ProfessionalVoice"
-} ' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/projects/ProjectId?api-version=2023-12-01-preview"
+} ' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/projects/ProjectId?api-version=2024-02-01-preview"
```

You should receive a response body in the following format:

articles/ai-services/speech-service/includes/how-to/professional-voice/create-training-set/rest.md

Lines changed: 8 additions & 8 deletions
@@ -15,14 +15,14 @@ In this article, you [create a training set](#create-a-training-set) and get its

## Create a training set

-To create a training set, use the [TrainingSets_Create](/rest/api/speechapi/training-sets/create) operation of the custom voice API. Construct the request body according to the following instructions:
+To create a training set, use the [TrainingSets_Create](/rest/api/aiservices/speechapi/training-sets/create) operation of the custom voice API. Construct the request body according to the following instructions:

- Set the required `projectId` property. See [create a project](../../../../professional-voice-create-project.md).
- Set the required `voiceKind` property to `Male` or `Female`. The kind can't be changed later.
- Set the required `locale` property. This should be the locale of the training set data. The locale of the training set should be the same as the locale of the [consent statement](../../../../professional-voice-create-consent.md). The locale can't be changed later. You can find the text to speech locale list [here](/azure/ai-services/speech-service/language-support?tabs=tts).
- Optionally, set the `description` property for the training set description. The training set description can be changed later.

-Make an HTTP PUT request using the URI as shown in the following [TrainingSets_Create](/rest/api/speechapi/training-sets/create) example.
+Make an HTTP PUT request using the URI as shown in the following [TrainingSets_Create](/rest/api/aiservices/speechapi/training-sets/create) example.
- Replace `YourResourceKey` with your Speech resource key.
- Replace `YourResourceRegion` with your Speech resource region.
- Replace `JessicaTrainingSetId` with a training set ID of your choice. The case sensitive ID will be used in the training set's URI and can't be changed later.

@@ -33,7 +33,7 @@ curl -v -X PUT -H "Ocp-Apim-Subscription-Key: YourResourceKey" -H "Content-Type:
"projectId": "ProjectId",
"locale": "en-US",
"voiceKind": "Female"
-} ' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/trainingsets/JessicaTrainingSetId?api-version=2023-12-01-preview"
+} ' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/trainingsets/JessicaTrainingSetId?api-version=2024-02-01-preview"
```

You should receive a response body in the following format:

@@ -53,7 +53,7 @@ You should receive a response body in the following format:

## Upload training set data

-To upload a training set of audio and scripts, use the [TrainingSets_UploadData](/rest/api/speechapi/training-sets/upload-data) operation of the custom voice API.
+To upload a training set of audio and scripts, use the [TrainingSets_UploadData](/rest/api/aiservices/speechapi/training-sets/upload-data) operation of the custom voice API.

Before calling this API, please store recording and script files in Azure Blob. In the example below, recording files are https://contoso.blob.core.windows.net/voicecontainer/jessica300/*.wav, script files are
https://contoso.blob.core.windows.net/voicecontainer/jessica300/*.txt.

@@ -70,7 +70,7 @@ Construct the request body according to the following instructions:
- Set the required `extensions` property to the extensions of the script files.
- Optionally, set the `prefix` property to set a prefix for the blob name.

-Make an HTTP POST request using the URI as shown in the following [TrainingSets_UploadData](/rest/api/speechapi/training-sets/upload-data) example.
+Make an HTTP POST request using the URI as shown in the following [TrainingSets_UploadData](/rest/api/aiservices/speechapi/training-sets/upload-data) example.
- Replace `YourResourceKey` with your Speech resource key.
- Replace `YourResourceRegion` with your Speech resource region.
- Replace `JessicaTrainingSetId` if you specified a different training set ID in the previous step.

@@ -92,13 +92,13 @@ curl -v -X POST -H "Ocp-Apim-Subscription-Key: YourResourceKey" -H "Content-Type
".txt"
]
}
-} ' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/trainingsets/JessicaTrainingSetId:upload?api-version=2023-12-01-preview"
+} ' "https://YourResourceRegion.api.cognitive.microsoft.com/customvoice/trainingsets/JessicaTrainingSetId:upload?api-version=2024-02-01-preview"
```

-The response header contains the `Operation-Location` property. Use this URI to get details about the [TrainingSets_UploadData](/rest/api/speechapi/training-sets/upload-data) operation. Here's an example of the response header:
+The response header contains the `Operation-Location` property. Use this URI to get details about the [TrainingSets_UploadData](/rest/api/aiservices/speechapi/training-sets/upload-data) operation. Here's an example of the response header:

```HTTP 201
-Operation-Location: https://eastus.api.cognitive.microsoft.com/customvoice/operations/284b7e37-f42d-4054-8fa9-08523c3de345?api-version=2023-12-01-preview
+Operation-Location: https://eastus.api.cognitive.microsoft.com/customvoice/operations/284b7e37-f42d-4054-8fa9-08523c3de345?api-version=2024-02-01-preview
Operation-Id: 284b7e37-f42d-4054-8fa9-08523c3de345
```
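Both the consent and training-set upload operations above return an `Operation-Location` header, and a caller typically polls that URI until the operation finishes. The sketch below shows only the loop shape, with an injectable `fetch` function so it can run without a live service; the status strings are illustrative assumptions, so check the operation response schema for the exact values:

```python
# Polling pattern for an Operation-Location URI like the ones shown above.
# `fetch` is injected so the loop can be demonstrated without a live service;
# the status names ("Running", "Succeeded", ...) are illustrative assumptions.
import itertools
import time

def wait_for_operation(operation_url, fetch, interval=0.0, max_polls=30):
    for _ in range(max_polls):
        status = fetch(operation_url)
        if status in ("Succeeded", "Failed"):
            return status
        time.sleep(interval)  # in practice, poll every few seconds
    raise TimeoutError(f"operation did not finish: {operation_url}")

# Fake fetcher standing in for an authenticated GET of the operation URI.
responses = itertools.chain(["NotStarted", "Running", "Running"],
                            itertools.repeat("Succeeded"))
fake_fetch = lambda url: next(responses)

op_url = ("https://eastus.api.cognitive.microsoft.com/customvoice/operations/"
          "284b7e37-f42d-4054-8fa9-08523c3de345?api-version=2024-02-01-preview")
print(wait_for_operation(op_url, fake_fetch))  # Succeeded
```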
