Commit 5cac72c

committed
STT overview refresh
1 parent 789a484 commit 5cac72c

1 file changed: +16 -16 lines changed

articles/ai-services/speech-service/speech-to-text.md

Lines changed: 16 additions & 16 deletions
@@ -25,12 +25,12 @@ The speech to text service offers the following core features:
 ## Real-time speech to text
 
 Real-time speech to text transcribes audio as it's recognized from a microphone or file. It's ideal for applications requiring immediate transcription, such as:
-- Transcriptions, captions, or subtitles for live meetings: Real-time audio transcription for accessibility and record-keeping.
-- Diarization: Identifying and distinguishing between different speakers in the audio.
-- Pronunciation assessment: Evaluating and providing feedback on pronunciation accuracy.
-- Call center agents assist: Providing real-time transcription to assist customer service representatives.
-- Dictation: Transcribing spoken words into written text for documentation purposes.
-- Voice agents: Enabling interactive voice response systems to transcribe user queries and commands.
+- **Transcriptions, captions, or subtitles for live meetings**: Real-time audio transcription for accessibility and record-keeping.
+- **Diarization**: Identifying and distinguishing between different speakers in the audio.
+- **Pronunciation assessment**: Evaluating and providing feedback on pronunciation accuracy.
+- **Call center agents assist**: Providing real-time transcription to assist customer service representatives.
+- **Dictation**: Transcribing spoken words into written text for documentation purposes.
+- **Voice agents**: Enabling interactive voice response systems to transcribe user queries and commands.
 
 Real-time speech to text can be accessed via the Speech SDK, Speech CLI, and REST API, allowing integration into various applications and workflows.
 Real-time speech to text is available via the [Speech SDK](speech-sdk.md), the [Speech CLI](spx-overview.md), and REST APIs such as the [Fast transcription API](fast-transcription-create.md).
@@ -39,8 +39,8 @@ Real-time speech to text is available via the [Speech SDK](speech-sdk.md), the [
 
 Fast transcription API is used to transcribe audio files with returning results synchronously and faster than real-time audio. Use fast transcription in the scenarios that you need the transcript of an audio recording as quickly as possible with predictable latency, such as:
 
-- Quick audio or video transcription and subtitles: Quickly get a transcription of an entire video or audio file in one go.
-- Video translation: Immediately get new subtitles for a video if you have audio in different languages.
+- **Quick audio or video transcription and subtitles**: Quickly get a transcription of an entire video or audio file in one go.
+- **Video translation**: Immediately get new subtitles for a video if you have audio in different languages.
 
 > [!NOTE]
 > Fast transcription API is only available via the speech to text REST API version 2024-05-15-preview and later.
@@ -50,9 +50,9 @@ To get started with fast transcription, see [use the fast transcription API (pre
 ## Batch transcription API
 
 [Batch transcription](batch-transcription.md) is designed for transcribing large amounts of audio stored in files. This method processes audio asynchronously and is suited for:
-- Transcriptions, captions, or subtitles for prerecorded audio: Converting stored audio content into text.
-- Contact center post-call analytics: Analyzing recorded calls to extract valuable insights.
-- Diarization: Differentiating between speakers in recorded audio.
+- **Transcriptions, captions, or subtitles for prerecorded audio**: Converting stored audio content into text.
+- **Contact center post-call analytics**: Analyzing recorded calls to extract valuable insights.
+- **Diarization**: Differentiating between speakers in recorded audio.
 
 Batch transcription is available via:
 - [Speech to text REST API](rest-speech-to-text.md): Facilitates batch processing with the flexibility of RESTful calls. To get started, see [How to use batch transcription](batch-transcription.md) and [Batch transcription samples](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/batch).
@@ -72,8 +72,8 @@ With [custom speech](./custom-speech-overview.md), you can evaluate and improve
 Out of the box, speech recognition utilizes a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. The base model is pretrained with dialects and phonetics representing various common domains. When you make a speech recognition request, the most recent base model for each [supported language](language-support.md?tabs=stt) is used by default. The base model works well in most speech recognition scenarios.
 
 Custom speech allows you to tailor the speech recognition model to better suit your application's specific needs. This can be particularly useful for:
-- Improving recognition of domain-specific vocabulary: Train the model with text data relevant to your field.
-- Enhancing accuracy for specific audio conditions: Use audio data with reference transcriptions to refine the model.
+- **Improving recognition of domain-specific vocabulary**: Train the model with text data relevant to your field.
+- **Enhancing accuracy for specific audio conditions**: Use audio data with reference transcriptions to refine the model.
 
 For more information about custom speech, see the [custom speech overview](./custom-speech-overview.md) and the [speech to text REST API](rest-speech-to-text.md) documentation.
 
@@ -87,9 +87,9 @@ Here are some practical examples of how you can utilize Azure AI speech to text:
 | --- | --- | --- |
 | **Live meeting transcriptions and captions** | A virtual event platform needs to provide real-time captions for webinars. | Integrate real-time speech to text using the Speech SDK to transcribe spoken content into captions displayed live during the event. |
 | **Customer service enhancement** | A call center wants to assist agents by providing real-time transcriptions of customer calls. | Use real-time speech to text via the Speech CLI to transcribe calls, enabling agents to better understand and respond to customer queries. |
-| **Video subtitling** | A video-hosting platform wants to quickly generate a set of subtitles for a video. | Implement Fast Transcription to quickly get a set of subtitles for the entire video. |
-| **Educational tools** | An e-learning platform aims to provide transcriptions for video lectures. | Apply batch transcription through the Speech to text REST API to process prerecorded lecture videos, generating text transcripts for students. |
-| **Healthcare documentation** | A healthcare provider needs to document patient consultations. | Implement real-time speech to text for dictation, allowing healthcare professionals to speak their notes and have them transcribed instantly. Use a custom model to enhance recognition on specific medical terms. |
+| **Video subtitling** | A video-hosting platform wants to quickly generate a set of subtitles for a video. | Use fast transcription to quickly get a set of subtitles for the entire video. |
+| **Educational tools** | An e-learning platform aims to provide transcriptions for video lectures. | Apply batch transcription through the speech to text REST API to process prerecorded lecture videos, generating text transcripts for students. |
+| **Healthcare documentation** | A healthcare provider needs to document patient consultations. | Use real-time speech to text for dictation, allowing healthcare professionals to speak their notes and have them transcribed instantly. Use a custom model to enhance recognition of specific medical terms. |
 | **Media and entertainment** | A media company wants to create subtitles for a large archive of videos. | Use batch transcription to process the video files in bulk, generating accurate subtitles for each video. |
 | **Market research** | A market research firm needs to analyze customer feedback from audio recordings. | Employ batch transcription to convert audio feedback into text, enabling easier analysis and insights extraction. |
 
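Reviewer note: the article touched by this diff distinguishes three transcription modes. Real-time handles audio as it arrives, fast transcription handles a recording synchronously, and batch transcription handles large amounts of stored audio asynchronously. The choice between them could be sketched roughly as below. This is only an illustration: `Mode`, `Workload`, and `choose_mode` are hypothetical names that belong to no Azure SDK, and the rule that a single file needed immediately goes to fast transcription is an assumption drawn from the article's wording, not a documented service limit.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """The three transcription modes named in the article (labels only)."""
    REAL_TIME = "real-time speech to text"
    FAST = "fast transcription"
    BATCH = "batch transcription"


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a transcription job."""
    live_audio: bool                # audio is still being captured (microphone, live call)
    file_count: int = 1             # number of stored recordings to process
    needs_result_now: bool = False  # the caller waits for the transcript


def choose_mode(workload: Workload) -> Mode:
    """Pick a mode using the article's descriptions as a rough heuristic."""
    if workload.live_audio:
        # Live captions, dictation, voice agents, agent assist.
        return Mode.REAL_TIME
    if workload.needs_result_now and workload.file_count == 1:
        # One recording, transcript wanted as quickly as possible.
        return Mode.FAST
    # Large or non-urgent sets of stored audio are processed asynchronously.
    return Mode.BATCH


if __name__ == "__main__":
    webinar = Workload(live_audio=True)
    single_video = Workload(live_audio=False, needs_result_now=True)
    archive = Workload(live_audio=False, file_count=5000)
    for job in (webinar, single_video, archive):
        print(job, "->", choose_mode(job).value)
```

The three sample workloads correspond to the "Live meeting transcriptions and captions", "Video subtitling", and "Media and entertainment" rows of the use-case table.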
