
Commit 4096064

Merge pull request #207326 from eric-urban/eur/embed-video-stt
embed show recording
2 parents 9cc9e73 + 5983d35


articles/cognitive-services/Speech-Service/speech-to-text.md

Lines changed: 8 additions & 14 deletions
@@ -25,33 +25,27 @@ Speech-to-text, also known as speech recognition, enables real-time or offline t

## Get started

-To get started with speech-to-text, see the [quickstart](get-started-speech-to-text.md). Speech-to-text is available via the [Speech SDK](speech-sdk.md), the [REST API](rest-speech-to-text.md), and the [Speech CLI](spx-overview.md).
+To get started, try the [speech-to-text quickstart](get-started-speech-to-text.md). Speech-to-text is available via the [Speech SDK](speech-sdk.md), the [REST API](rest-speech-to-text.md), and the [Speech CLI](spx-overview.md).

-Sample code for the Speech SDK is available on GitHub. These samples cover common scenarios like reading audio from a file or stream for continuous and single-shot recognition, and working with custom models:
+The following video shows how to install the [Speech SDK for C#](quickstarts/setup-platform.md) and write a simple .NET console application for speech-to-text.
+
+> [!VIDEO c20d3b0c-e96a-4154-9299-155e27db7117]
+
+In depth samples are available in the [Azure-Samples/cognitive-services-speech-sdk](https://aka.ms/csspeech/samples) repository on GitHub. There are samples for C# (including UWP, Unity, and Xamarin), C++, Java, JavaScript (including Browser and Node.js), Objective-C, Python, and Swift. Code samples for Go are available in the [Microsoft/cognitive-services-speech-sdk-go](https://github.com/Microsoft/cognitive-services-speech-sdk-go) repository on GitHub.

-- [Speech-to-text samples (SDK)](https://github.com/Azure-Samples/cognitive-services-speech-sdk)
-- [Batch transcription samples (REST)](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/batch)
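For context on the console application the added video describes, a minimal single-shot recognition sketch looks roughly like the following. This is not necessarily the exact code in the video; it assumes the Microsoft.CognitiveServices.Speech NuGet package and placeholder key and region values.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class Program
{
    static async Task Main()
    {
        // Placeholder subscription key and region; replace with your own Speech resource values.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // With only a SpeechConfig, the recognizer listens to the default microphone.
        using var recognizer = new SpeechRecognizer(config);
        Console.WriteLine("Speak into your microphone.");

        // Recognize a single utterance; continuous recognition is covered in the samples repository.
        var result = await recognizer.RecognizeOnceAsync();

        if (result.Reason == ResultReason.RecognizedSpeech)
        {
            Console.WriteLine($"RECOGNIZED: {result.Text}");
        }
        else
        {
            Console.WriteLine($"Recognition ended with reason: {result.Reason}");
        }
    }
}
```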

## Batch transcription

-Batch transcription is a set of REST API operations that enable you to transcribe a large amount of audio in storage. You can point to audio files with a shared access signature (SAS) URI and asynchronously receive transcription results. For more information on how to use the batch transcription API, see [How to use batch transcription](batch-transcription.md).
+Batch transcription is a set of [Speech-to-text REST API v3.0](rest-speech-to-text.md) operations that enable you to transcribe a large amount of audio in storage. You can point to audio files with a shared access signature (SAS) URI and asynchronously receive transcription results. For more information on how to use the batch transcription API, see [How to use batch transcription](batch-transcription.md) and [Batch transcription samples (REST)](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/batch).
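As an illustration of the v3.0 operations the updated paragraph links to, here is a minimal sketch of creating a batch transcription from C#. The region, key, and SAS URL are placeholders; see the linked how-to and samples for the full request and response schema.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class CreateBatchTranscription
{
    static async Task Main()
    {
        // Placeholder region; replace with your own Speech resource region.
        var endpoint = "https://YourServiceRegion.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions";

        // Point the service at audio in storage via a SAS URL (placeholder shown here).
        var body = @"{
          ""displayName"": ""My batch transcription"",
          ""locale"": ""en-US"",
          ""contentUrls"": [ ""https://yourstorage.blob.core.windows.net/audio/sample.wav?sv=<SAS>"" ]
        }";

        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YourSubscriptionKey");

        // A successful create returns 201 Created with a transcription object whose
        // "self" URL can be polled until the transcription completes and results are ready.
        var response = await client.PostAsync(endpoint,
            new StringContent(body, Encoding.UTF8, "application/json"));
        Console.WriteLine($"{(int)response.StatusCode}: {await response.Content.ReadAsStringAsync()}");
    }
}
```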
## Custom Speech

The Azure speech-to-text service analyzes audio in real-time or batch to transcribe the spoken word into text. Out of the box, speech to text utilizes a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. This base model is pre-trained with dialects and phonetics representing a variety of common domains. The base model works well in most scenarios.

-The base model may not be sufficient if the audio contains ambient noise or includes a lot of industry and domain-specific jargon. In these cases, building a custom speech model makes sense by training with additional data associated with that specific domain. You can create and train custom acoustic, language, and pronunciation models. For more information, see [Custom Speech](./custom-speech-overview.md).
+The base model may not be sufficient if the audio contains ambient noise or includes a lot of industry and domain-specific jargon. In these cases, building a custom speech model makes sense by training with additional data associated with that specific domain. You can create and train custom acoustic, language, and pronunciation models. For more information, see [Custom Speech](./custom-speech-overview.md) and [Speech-to-text REST API v3.0](rest-speech-to-text.md).

Customization options vary by language or locale. To verify support, see [Language and voice support for the Speech service](./language-support.md).

-### REST API
-
-In some cases, you can't or shouldn't use the [Speech SDK](speech-sdk.md). For speech-to-text REST APIs, see the following documentation:
-
-- [Speech-to-text REST API v3.0](rest-speech-to-text.md): You should use the REST API for [batch transcription](batch-transcription.md) and [Custom Speech](custom-speech-overview.md).
-- [Speech-to-text REST API for short audio](rest-speech-to-text-short.md): Use it only in cases where you can't use the [Speech SDK](speech-sdk.md).
-
## Next steps

- [Get started with speech-to-text](get-started-speech-to-text.md)
