
Commit 4a9ba0c

Merge pull request #3157 from eric-urban/eur/custom-speech-foundry-portal

basic pivot framework to add custom speech in foundry portal

2 parents d46027f + c892057

45 files changed: +382 −344 lines changed

articles/ai-services/.openpublishing.redirection.ai-services.json

Lines changed: 8 additions & 1 deletion
@@ -460,6 +460,13 @@
       "redirect_url": "/azure/ai-services/speech-service/how-to-custom-speech-model-and-endpoint-lifecycle",
       "redirect_document_id": false
     },
+    {
+      "source_path_from_root": "/articles/ai-services/speech-service/custom-speech-ai-foundry-portal.md",
+      "redirect_url": "/azure/ai-services/speech-service/how-to-custom-speech-create-project",
+      "redirect_document_id": false
+    },
+
+
     {
       "source_path_from_root": "/articles/ai-services/anomaly-detector/how-to/postman.md",
       "redirect_url": "/azure/ai-services/anomaly-detector/overview",
@@ -777,7 +784,7 @@
     },
     {
       "source_path_from_root": "/articles/ai-services/speech-service/custom-speech-ai-studio.md",
-      "redirect_url": "/azure/ai-services/speech-service/custom-speech-ai-foundry-portal",
+      "redirect_url": "/azure/ai-services/speech-service/how-to-custom-speech-create-project",
       "redirect_document_id": true
     },
     {
articles/ai-services/speech-service/batch-transcription-create.md

Lines changed: 1 addition & 1 deletion
@@ -246,7 +246,7 @@ To use a custom speech model for batch transcription, you need the model's URI.
 > [!TIP]
 > A [hosted deployment endpoint](how-to-custom-speech-deploy-model.md) isn't required to use custom speech with the batch transcription service. You can conserve resources if you use the [custom speech model](how-to-custom-speech-train-model.md) only for batch transcription.

-Batch transcription requests for expired models fail with a 4xx error. Set the `model` property to a base model or custom model that isn't expired. Otherwise don't include the `model` property to always use the latest base model. For more information, see [Choose a model](how-to-custom-speech-create-project.md#choose-your-model) and [Custom speech model lifecycle](how-to-custom-speech-model-and-endpoint-lifecycle.md).
+Batch transcription requests for expired models fail with a 4xx error. Set the `model` property to a base model or custom model that isn't expired. Otherwise don't include the `model` property to always use the latest base model. For more information, see [Choose a model](./custom-speech-overview.md#choose-your-model) and [Custom speech model lifecycle](how-to-custom-speech-model-and-endpoint-lifecycle.md).

 ## Use a Whisper model
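For context, here's a minimal sketch of a batch transcription request that pins the `model` property to a specific model, as described in the updated paragraph. It assumes the Speech to text REST API v3.2; the region, key, model ID, and audio URL are placeholder values, not part of this commit.

```python
import requests

# Assumed placeholders: replace with your own region, resource key, and model ID.
REGION = "eastus"
SPEECH_KEY = "YOUR_SPEECH_RESOURCE_KEY"
MODEL_ID = "00000000-0000-0000-0000-000000000000"  # a base or custom model that isn't expired

endpoint = f"https://{REGION}.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions"

body = {
    "displayName": "Batch transcription with a pinned model",
    "locale": "en-US",
    "contentUrls": ["https://example.com/audio/sample1.wav"],
    # Omit the "model" property entirely to always use the latest base model.
    "model": {
        "self": f"https://{REGION}.api.cognitive.microsoft.com/speechtotext/v3.2/models/{MODEL_ID}"
    },
}

response = requests.post(
    endpoint,
    json=body,
    headers={"Ocp-Apim-Subscription-Key": SPEECH_KEY},
)
response.raise_for_status()  # an expired model surfaces here as a 4xx error
print(response.json()["self"])  # URL of the created transcription
```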

articles/ai-services/speech-service/custom-speech-ai-foundry-portal.md

Lines changed: 0 additions & 127 deletions
This file was deleted.

articles/ai-services/speech-service/custom-speech-overview.md

Lines changed: 29 additions & 5 deletions
@@ -1,12 +1,12 @@
 ---
 title: Custom speech overview - Speech service
 titleSuffix: Azure AI services
-description: Custom speech is a set of online tools that allows you to evaluate and improve the speech to text accuracy for your applications, tools, and products.
+description: Custom speech allows you to evaluate and improve the speech to text accuracy for your applications, tools, and products.
 author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: overview
-ms.date: 9/15/2024
+ms.date: 2/25/2025
 ms.author: eur
 ms.custom: references_regions
 ---
@@ -29,17 +29,41 @@ With custom speech, you can upload your own data, test and train a custom model,

 Here's more information about the sequence of steps shown in the previous diagram:

-1. [Create a project](how-to-custom-speech-create-project.md) and choose a model. Use a <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesAIServices" title="Create an AI Services resource for Speech" target="_blank">Speech resource</a> that you create in the Azure portal. If you train a custom model with audio data, choose an AI Services resource for Speech region with dedicated hardware for training audio data. For more information, see footnotes in the [regions](regions.md#regions) table.
+1. [Create a project](how-to-custom-speech-create-project.md) and choose a model. Use a <a href="https://portal.azure.com/#create/Microsoft.CognitiveServicesAIServices" title="Create an AI Services resource for Speech" target="_blank">Speech resource</a> that you create in the Azure portal. If you train a custom model with audio data, select a service resource in a region with dedicated hardware for training audio data. For more information, see footnotes in the [regions](regions.md#regions) table.
+
 1. [Upload test data](./how-to-custom-speech-upload-data.md). Upload test data to evaluate the speech to text offering for your applications, tools, and products.
-1. [Test recognition quality](how-to-custom-speech-inspect-data.md). Use the [Speech Studio](https://aka.ms/speechstudio/customspeech) to play back uploaded audio and inspect the speech recognition quality of your test data.
-1. [Test model quantitatively](how-to-custom-speech-evaluate-data.md). Evaluate and improve the accuracy of the speech to text model. The Speech service provides a quantitative word error rate (WER), which you can use to determine if more training is required.
+
 1. [Train a model](how-to-custom-speech-train-model.md). Provide written transcripts and related text, along with the corresponding audio data. Testing a model before and after training is optional but recommended.
+
 > [!NOTE]
 > You pay for custom speech model usage and [endpoint hosting](how-to-custom-speech-deploy-model.md). You'll also be charged for custom speech model training if the base model was created on October 1, 2023 and later. You're not charged for training if the base model was created prior to October 2023. For more information, see [Azure AI Speech pricing](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/) and the [Charge for adaptation section in the speech to text 3.2 migration guide](./migrate-v3-1-to-v3-2.md#charge-for-adaptation).
+
+1. [Test recognition quality](how-to-custom-speech-inspect-data.md). Use the [Speech Studio](https://aka.ms/speechstudio/customspeech) to play back uploaded audio and inspect the speech recognition quality of your test data.
+
+1. [Test model quantitatively](how-to-custom-speech-evaluate-data.md). Evaluate and improve the accuracy of the speech to text model. The Speech service provides a quantitative word error rate (WER), which you can use to determine if more training is required.
+
 1. [Deploy a model](how-to-custom-speech-deploy-model.md). Once you're satisfied with the test results, deploy the model to a custom endpoint. Except for [batch transcription](batch-transcription.md), you must deploy a custom endpoint to use a custom speech model.
+
 > [!TIP]
 > A hosted deployment endpoint isn't required to use custom speech with the [Batch transcription API](batch-transcription.md). You can conserve resources if the custom speech model is only used for batch transcription. For more information, see [Speech service pricing](https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/).

+## Choose your model
+
+There are a few approaches to using custom speech models:
+- The base model provides accurate speech recognition out of the box for a range of [scenarios](./overview.md#speech-scenarios). Base models are updated periodically to improve accuracy and quality. If you use base models, we recommend using the latest default base model. If a required customization capability is only available with an older model, then you can choose an older base model.
+- A custom model augments the base model to include domain-specific vocabulary shared across all areas of the custom domain.
+- Multiple custom models can be used when the custom domain has multiple areas, each with a specific vocabulary.
+
+One recommended way to see if the base model suffices is to analyze the transcription produced from the base model and compare it with a human-generated transcript for the same audio. You can compare the transcripts and obtain a [word error rate (WER)](how-to-custom-speech-evaluate-data.md#evaluate-word-error-rate-wer) score. If the WER score is high, training a custom model to recognize the incorrectly identified words is recommended.
+
+Multiple models are recommended if the vocabulary varies across the domain areas. For instance, Olympic commentators report on various events, each associated with its own vernacular. Because each Olympic event vocabulary differs significantly from others, building a custom model specific to an event increases accuracy by limiting the utterance data relative to that particular event. As a result, the model doesn't need to sift through unrelated data to make a match. Regardless, training still requires a decent variety of training data. Include audio from various commentators who have different accents, genders, and ages.
+
+## Model stability and lifecycle
+
+A base model or custom model deployed to an endpoint using custom speech is fixed until you decide to update it. The speech recognition accuracy and quality remain consistent, even when a new base model is released. This allows you to lock in the behavior of a specific model until you decide to use a newer model.
+
+Whether you train your own model or use a snapshot of a base model, you can use the model for a limited time. For more information, see [Model and endpoint lifecycle](./how-to-custom-speech-model-and-endpoint-lifecycle.md).
+
 ## Responsible AI

 An AI system includes not only the technology, but also the people who use it, the people who are affected by it, and the environment in which it's deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.
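The word error rate (WER) referenced in the new "Choose your model" section is the ratio of substitutions, deletions, and insertions to the number of words in the human reference transcript. Here's a minimal, self-contained sketch of that calculation (not the Speech service's implementation), useful for comparing a base-model transcription against a human-generated transcript:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER = (S + D + I) / N using word-level edit distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # Dynamic-programming (Levenshtein) edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions only
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions only
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1  # match or substitution
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / len(ref)

# Example: one substituted word in a five-word reference gives a WER of 0.2 (20%).
print(word_error_rate("the quick brown fox jumps", "the quick browne fox jumps"))
```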

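The deploy step above produces a custom endpoint ID that client applications reference at recognition time. Here's a hedged sketch using the Speech SDK for Python; the key, region, endpoint ID, and audio file name are placeholder values, not part of this commit.

```python
import azure.cognitiveservices.speech as speechsdk

# Assumed placeholders: replace with your Speech resource key, region, and the
# endpoint ID shown after you deploy the custom speech model.
speech_config = speechsdk.SpeechConfig(subscription="YOUR_SPEECH_RESOURCE_KEY", region="eastus")
speech_config.endpoint_id = "YOUR_CUSTOM_ENDPOINT_ID"  # routes recognition to the custom model

audio_config = speechsdk.audio.AudioConfig(filename="sample.wav")
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

result = recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print(result.text)
else:
    print(f"Recognition did not succeed: {result.reason}")
```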