
Commit b74e595

Merge pull request #403 from eric-urban/eur/speech-refresh
remove luis doc and refresh
2 parents 281d200 + 64f2f86 commit b74e595

12 files changed: +36 -317 lines

articles/ai-services/.openpublishing.redirection.ai-services.json

Lines changed: 10 additions & 0 deletions
@@ -30,6 +30,11 @@
   "redirect_url": "/azure/ai-services/language-service/conversational-language-understanding/how-to/migrate-from-luis",
   "redirect_document_id": false
 },
+{
+  "source_path_from_root": "/articles/ai-services/luis/luis-concept-data-conversion.md",
+  "redirect_url": "/azure/ai-services/language-service/conversational-language-understanding/how-to/migrate-from-luis",
+  "redirect_document_id": false
+},
 {
   "source_path_from_root": "/articles/ai-services/custom-vision-service/update-application-to-3.0-sdk.md",
   "redirect_url": "/azure/ai-services/custom-vision-service/overview",
@@ -405,6 +410,11 @@
   "redirect_url": "/azure/ai-services/speech-service/release-notes",
   "redirect_document_id": false
 },
+{
+  "source_path_from_root": "/articles/ai-services/speech-service/how-to-recognize-intents-from-speech-csharp.md",
+  "redirect_url": "/azure/ai-services/speech-service/intent-recognition",
+  "redirect_document_id": false
+},
 {
   "source_path_from_root": "/articles/ai-services/anomaly-detector/how-to/postman.md",
   "redirect_url": "/azure/ai-services/anomaly-detector/overview",

articles/ai-services/luis/faq.md

Lines changed: 0 additions & 4 deletions
@@ -24,10 +24,6 @@ LUIS has several limit areas. The first is the model limit, which controls inten
 
 An authoring resource lets you create, manage, train, test, and publish your applications. A prediction resource lets you query your prediction endpoint beyond the 1,000 requests provided by the authoring resource. See [Authoring and query prediction endpoint keys in LUIS](luis-how-to-azure-subscription.md) to learn about the differences between the authoring key and the prediction runtime key.
 
-## Does LUIS support speech to text?
-
-Yes, [Speech](../speech-service/how-to-recognize-intents-from-speech-csharp.md#luis-and-speech) to text is provided as an integration with LUIS.
-
 ## What are Synonyms and word variations?
 
 LUIS has little or no knowledge of the broader _NLP_ aspects, such as semantic similarity, without explicit identification in examples. For example, the following tokens (words) are three different things until they're used in similar contexts in the examples provided:

articles/ai-services/luis/luis-concept-data-conversion.md

Lines changed: 0 additions & 42 deletions
This file was deleted.

articles/ai-services/luis/luis-limits.md

Lines changed: 0 additions & 4 deletions
@@ -95,10 +95,6 @@ Use the _kind_, `LUIS`, when filtering resources in the Azure portal.The LUIS qu
 
 [Sentiment analysis integration](how-to/publish.md), which provides sentiment information, is provided without requiring another Azure resource.
 
-### Speech integration
-
-[Speech integration](../speech-service/how-to-recognize-intents-from-speech-csharp.md) provides 1 thousand endpoint requests per unit cost.
-
 [Learn more about pricing.][pricing]
 
 ## Keyboard controls

articles/ai-services/luis/toc.yml

Lines changed: 0 additions & 4 deletions
@@ -74,8 +74,6 @@ items:
   items:
   - name: With Bing Spell Check v7
     href: luis-tutorial-bing-spellcheck.md
-  - name: With Speech service
-    href: ../speech-service/how-to-recognize-intents-from-speech-csharp.md?toc=/azure/ai-services/luis/toc.json&bc=/azure/ai-services/luis/breadcrumb/toc.json
   - name: With LUIS and question answering using orchestration
     href: how-to/orchestration-projects.md
   - name: Migrate to conversational language understanding
@@ -157,8 +155,6 @@ items:
     href: luis-concept-data-alteration.md
   - name: Data retention
     href: luis-concept-data-storage.md
-  - name: Data conversion
-    href: luis-concept-data-conversion.md
   - name: Data extraction
     href: luis-concept-data-extraction.md
   - name: Security

articles/ai-services/speech-service/how-to-custom-voice-training-data.md

Lines changed: 8 additions & 7 deletions
@@ -6,8 +6,9 @@ author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 1/21/2024
+ms.date: 9/20/2024
 ms.author: eur
+#Customer intent: As a developer, I want to learn about the data types that I can use to train a custom neural voice.
 ---
 
 # Training data for custom neural voice
@@ -28,8 +29,8 @@ This table lists data types and how each is used to create a custom Text to spee
 | Data type | Description | When to use | Extra processing required |
 | --------- | ----------- | ----------- | ------------------------------ |
 | [Individual utterances + matching transcript](#individual-utterances--matching-transcript) | A collection (.zip) of audio files (.wav) as individual utterances. Each audio file should be 15 seconds or less in length, paired with a formatted transcript (.txt). | Professional recordings with matching transcripts | Ready for training. |
-| [Long audio + transcript](#long-audio--transcript-preview) | A collection (.zip) of long, unsegmented audio files (.wav or .mp3, longer than 20 seconds, at most 1000 audio files), paired with a collection (.zip) of transcripts that contains all spoken words. | You have audio files and matching transcripts, but they aren't segmented into utterances. | Segmentation (using batch transcription).<br>Audio format transformation wherever required. |
-| [Audio only (Preview)](#audio-only-preview) | A collection (.zip) of audio files (.wav or .mp3, at most 1000 audio files) without a transcript. | You only have audio files available, without transcripts. | Segmentation + transcript generation (using batch transcription).<br>Audio format transformation wherever required.|
+| [Long audio + transcript](#long-audio--transcript-preview) | A collection (.zip) of long, unsegmented audio files (.wav or .mp3, longer than 20 seconds, at most 1,000 audio files), paired with a collection (.zip) of transcripts that contains all spoken words. | You have audio files and matching transcripts, but they aren't segmented into utterances. | Segmentation (using batch transcription).<br>Audio format transformation wherever required. |
+| [Audio only (Preview)](#audio-only-preview) | A collection (.zip) of audio files (.wav or .mp3, at most 1,000 audio files) without a transcript. | You only have audio files available, without transcripts. | Segmentation + transcript generation (using batch transcription).<br>Audio format transformation wherever required.|
 
 Files should be grouped by type into a dataset and uploaded as a zip file. Each dataset can only contain a single data type.
 
@@ -107,12 +108,12 @@ Follow these guidelines when preparing audio for segmentation.
 | Sample format |RIFF(.wav): PCM, at least 16-bit.<br/><br/>mp3: At least 256 KBps bit rate.|
 | Audio length | Longer than 20 seconds |
 | Archive format | .zip |
-| Maximum archive size | 2048 MB, at most 1000 audio files included |
+| Maximum archive size | 2048 MB, at most 1,000 audio files included |
 
 > [!NOTE]
 > The default sampling rate for a custom neural voice is 24,000 Hz. Audio files with a sampling rate lower than 16,000 Hz will be rejected. Your audio files with a sampling rate higher than 16,000 Hz and lower than 24,000 Hz will be up-sampled to 24,000 Hz to train a neural voice. It's recommended that you should use a sample rate of 24,000 Hz for your training data.
 
-All audio files should be grouped into a zip file. It's OK to put .wav files and .mp3 files into the same zip file. For example, you can upload a 45 second audio file named 'kingstory.wav' and a 200 second long audio file named 'queenstory.mp3' in the same zip file. All .mp3 files will be transformed into the .wav format after processing.
+All audio files should be grouped into a zip file. It's OK to put .wav files and .mp3 files into the same zip file. For example, you can upload a 45-second audio file named 'kingstory.wav' and a 200-second long audio file named 'queenstory.mp3' in the same zip file. All .mp3 files will be transformed into the .wav format after processing.
 
 ### Transcription data for Long audio + transcript
 
@@ -126,7 +127,7 @@ Transcripts must be prepared to the specifications listed in this table. Each au
 | # of utterances per line | No limit |
 | Maximum file size | 2048 MB |
 
-All transcripts files in this data type should be grouped into a zip file. For example, you might upload a 45 second audio file named 'kingstory.wav' and a 200 second long audio file named 'queenstory.mp3' in the same zip file. You need to upload another zip file containing the corresponding two transcripts--one named 'kingstory.txt' and the other one named 'queenstory.txt'. Within each plain text file, you provide the full correct transcription for the matching audio.
+All transcripts files in this data type should be grouped into a zip file. For example, you might upload a 45-second audio file named 'kingstory.wav' and a 200-second long audio file named 'queenstory.mp3' in the same zip file. You need to upload another zip file containing the corresponding two transcripts--one named 'kingstory.txt' and the other one named 'queenstory.txt'. Within each plain text file, you provide the full correct transcription for the matching audio.
 
 After your dataset is successfully uploaded, we'll help you segment the audio file into utterances based on the transcript provided. You can check the segmented utterances and the matching transcripts by downloading the dataset. Unique IDs are assigned to the segmented utterances automatically. It's important that you make sure the transcripts you provide are 100% accurate. Errors in the transcripts can reduce the accuracy during the audio segmentation and further introduce quality loss in the training phase that comes later.
 
@@ -150,7 +151,7 @@ Follow these guidelines when preparing audio.
 | Sample format |RIFF(.wav): PCM, at least 16-bit<br>mp3: At least 256 KBps bit rate.|
 | Audio length | No limit |
 | Archive format | .zip |
-| Maximum archive size | 2048 MB, at most 1000 audio files included |
+| Maximum archive size | 2048 MB, at most 1,000 audio files included |
 
 > [!NOTE]
 > The default sampling rate for a custom neural voice is 24,000 Hz. Your audio files with a sampling rate higher than 16,000 Hz and lower than 24,000 Hz will be up-sampled to 24,000 Hz to train a neural voice. It's recommended that you should use a sample rate of 24,000 Hz for your training data.
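
For illustration only (not part of this commit): a minimal C# sketch of packaging the "Long audio + transcript" example above into the two zip archives the doc describes. The folder and archive names are hypothetical placeholders.

```csharp
using System.IO.Compression;

class PrepareCustomVoiceData
{
    static void Main()
    {
        // Hypothetical local folders matching the layout described above:
        //   audio/kingstory.wav, audio/queenstory.mp3
        //   transcripts/kingstory.txt, transcripts/queenstory.txt
        // Each data type goes into its own zip archive for upload.
        ZipFile.CreateFromDirectory("audio", "long-audio.zip");
        ZipFile.CreateFromDirectory("transcripts", "long-audio-transcripts.zip");
    }
}
```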

articles/ai-services/speech-service/how-to-get-speech-session-id.md

Lines changed: 6 additions & 4 deletions
@@ -2,12 +2,14 @@
 title: How to get speech to text session ID and transcription ID
 titleSuffix: Azure AI services
 description: Learn how to get speech to text session ID and transcription ID
-author: alexeyo26
+author: eric-urban
+ms.author: eur
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 1/21/2024
-ms.author: alexeyo
+ms.date: 9/20/2024
+ms.reviewer: alexeyo
+#Customer intent: As a developer, I need to know how to get the session ID and transcription ID for speech to text so that I can debug issues with my application.
 ---
 
 # How to get speech to text session ID and transcription ID
@@ -68,7 +70,7 @@ spx help translate log
 
 Unlike Speech SDK, [Speech to text REST API for short audio](rest-speech-to-text-short.md) doesn't automatically generate a Session ID. You need to generate it yourself and provide it within the REST request.
 
-Generate a GUID inside your code or using any standard tool. Use the GUID value *without dashes or other dividers*. As an example we'll use `9f4ffa5113a846eba289aa98b28e766f`.
+Generate a GUID inside your code or using any standard tool. Use the GUID value *without dashes or other dividers*. As an example we use `9f4ffa5113a846eba289aa98b28e766f`.
 
 As a part of your REST request use `X-ConnectionId=<GUID>` expression. For our example, a sample request looks like this:
 ```http
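
For illustration only (not part of this commit): a minimal C# sketch of what the updated article describes, generating a dash-free GUID and passing it as the `X-ConnectionId` query parameter on a short-audio recognition request. The region, key, audio file name, and exact endpoint shape are placeholder assumptions based on the Speech to text REST API for short audio.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class SessionIdSketch
{
    static async Task Main()
    {
        // A GUID formatted with "N" has no dashes, as the article requires.
        string sessionId = Guid.NewGuid().ToString("N");

        // Placeholder region; the URI follows the short-audio REST API pattern.
        string uri = "https://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"
                   + "?language=en-US&X-ConnectionId=" + sessionId;

        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<your-speech-key>");

        // Send a local WAV file as the request body.
        var audio = new ByteArrayContent(await System.IO.File.ReadAllBytesAsync("sample.wav"));
        audio.Headers.TryAddWithoutValidation("Content-Type", "audio/wav; codecs=audio/pcm; samplerate=16000");

        HttpResponseMessage response = await client.PostAsync(uri, audio);
        Console.WriteLine($"Session ID: {sessionId}, status: {response.StatusCode}");
    }
}
```

Keep the generated value; it's the session ID you quote when you need to debug a specific request.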

articles/ai-services/speech-service/how-to-lower-speech-synthesis-latency.md

Lines changed: 7 additions & 7 deletions
@@ -7,16 +7,16 @@ ms.author: eur
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 1/21/2024
+ms.date: 9/20/2024
 ms.reviewer: yulili
 ms.custom: references_regions, devx-track-extended-java, devx-track-python
 zone_pivot_groups: programming-languages-set-nineteen
+#Customer intent: As a developer, I need to know how to lower speech synthesis latency using Speech SDK so that I can improve the performance of my application.
 ---
 
 # Lower speech synthesis latency using Speech SDK
 
-The synthesis latency is critical to your applications.
-In this article, we'll introduce the best practices to lower the latency and bring the best performance to your end users.
+In this article, we introduce the best practices to lower the text to speech synthesis latency and bring the best performance to your end users.
 
 Normally, we measure the latency by `first byte latency` and `finish latency`, as follows:
 
@@ -298,12 +298,12 @@ SPXConnection* connection = [[SPXConnection alloc]initFromSpeechSynthesizer:synt
 ::: zone-end
 
 > [!NOTE]
-> If the synthesize text is available, just call `SpeakTextAsync` to synthesize the audio. The SDK will handle the connection.
+> If the text is available, just call `SpeakTextAsync` to synthesize the audio. The SDK will handle the connection.
 
 ### Reuse SpeechSynthesizer
 
 Another way to reduce the connection latency is to reuse the `SpeechSynthesizer` so you don't need to create a new `SpeechSynthesizer` for each synthesis.
-We recommend using object pool in service scenario, see our sample code for [C#](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/speech_synthesis_server_scenario_sample.cs) and [Java](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/java/jre/console/src/com/microsoft/cognitiveservices/speech/samples/console/SpeechSynthesisScenarioSamples.java).
+We recommend using object pool in service scenario. See our sample code for [C#](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/speech_synthesis_server_scenario_sample.cs) and [Java](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/java/jre/console/src/com/microsoft/cognitiveservices/speech/samples/console/SpeechSynthesisScenarioSamples.java).
 
 
 ## Transmit compressed audio over the network
@@ -313,10 +313,10 @@ Meanwhile, a compressed audio format helps to save the users' network bandwidth,
 
 We support many compressed formats including `opus`, `webm`, `mp3`, `silk`, and so on, see the full list in [SpeechSynthesisOutputFormat](/cpp/cognitive-services/speech/microsoft-cognitiveservices-speech-namespace#speechsynthesisoutputformat).
 For example, the bitrate of `Riff24Khz16BitMonoPcm` format is 384 kbps, while `Audio24Khz48KBitRateMonoMp3` only costs 48 kbps.
-Our Speech SDK will automatically use a compressed format for transmission when a `pcm` output format is set.
+The Speech SDK automatically uses a compressed format for transmission when a `pcm` output format is set.
 For Linux and Windows, `GStreamer` is required to enable this feature.
 Refer [this instruction](how-to-use-codec-compressed-audio-input-streams.md) to install and configure `GStreamer` for Speech SDK.
-For Android, iOS and macOS, no extra configuration is needed starting version 1.20.
+For Android, iOS, and macOS, no extra configuration is needed starting version 1.20.
 
 ## Input text streaming

articles/ai-services/speech-service/how-to-migrate-to-custom-neural-voice.md

Lines changed: 2 additions & 1 deletion
@@ -7,8 +7,9 @@ ms.author: eur
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 1/21/2024
+ms.date: 9/20/2024
 ms.reviewer: v-baolianzou
+#Customer intent: As a developer, I need to know how to migrate from custom voice to custom neural voice so that I can use the latest technology in my applications.
 ---
 
 # Migrate from custom voice to custom neural voice

articles/ai-services/speech-service/how-to-migrate-to-prebuilt-neural-voice.md

Lines changed: 2 additions & 1 deletion
@@ -6,8 +6,9 @@ author: eric-urban
66
manager: nitinme
77
ms.service: azure-ai-speech
88
ms.topic: how-to
9-
ms.date: 1/21/2024
9+
ms.date: 9/20/2024
1010
ms.author: eur
11+
#Customer intent: As a developer, I need to know how to migrate from prebuilt standard voice to prebuilt neural voice so that I can use the latest technology in my applications.
1112
---
1213

1314
# Migrate from prebuilt standard voice to prebuilt neural voice
