Merge pull request #1549 from eric-urban/eur/speech-fast-follow

AnnaMHuff · web-flow · commit 64ad868c5fa1 · 2024-11-19T08:46:14.000-07:00
Azure AI Speech fast follow updates
diff --git a/articles/ai-services/speech-service/includes/language-support/stt.md b/articles/ai-services/speech-service/includes/language-support/stt.md
@@ -74,7 +74,7 @@ ms.author: eur
 | `es-PR` | Spanish (Puerto Rico) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Structured text<br/><br/>Pronunciation |
 | `es-PY` | Spanish (Paraguay) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Structured text<br/><br/>Pronunciation |
 | `es-SV` | Spanish (El Salvador) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Structured text<br/><br/>Pronunciation |
-| `es-US` | Spanish (United States) | No | Plain text<br/><br/>Structured text<br/><br/>Pronunciation<br/><br/>Phrase list |
+| `es-US` | Spanish (United States)<sup>1</sup> | No | Plain text<br/><br/>Structured text<br/><br/>Pronunciation<br/><br/>Phrase list |
 | `es-UY` | Spanish (Uruguay) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Structured text<br/><br/>Pronunciation |
 | `es-VE` | Spanish (Venezuela) | No | Plain text<br/><br/>Structured text<br/><br/>Pronunciation |
 | `et-EE` | Estonian (Estonia) | No | Plain text<br/><br/>Pronunciation |
@@ -83,7 +83,7 @@ ms.author: eur
 | `fi-FI` | Finnish (Finland) | No | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Structured text<br/><br/>Output format<br/><br/>Pronunciation |
 | `fil-PH` | Filipino (Philippines) | No | Plain text<br/><br/>Pronunciation |
 | `fr-BE` | French (Belgium) | No | Plain text |
-| `fr-CA` | French (Canada) | No | Plain text<br/><br/>Structured text<br/><br/>Output format<br/><br/>Pronunciation<br/><br/>Phrase list |
+| `fr-CA` | French (Canada)<sup>1</sup> | No | Plain text<br/><br/>Structured text<br/><br/>Output format<br/><br/>Pronunciation<br/><br/>Phrase list |
 | `fr-CH` | French (Switzerland) | No | Plain text<br/><br/>Pronunciation |
 | `fr-FR` | French (France) | Yes | Audio + human-labeled transcript<br/><br/>Plain text<br/><br/>Structured text<br/><br/>Output format<br/><br/>Pronunciation<br/><br/>Phrase list |
 | `ga-IE` | Irish (Ireland) | No | Plain text<br/><br/>Pronunciation |
diff --git a/articles/ai-services/speech-service/includes/release-notes/release-notes-stt.md b/articles/ai-services/speech-service/includes/release-notes/release-notes-stt.md
@@ -2,11 +2,10 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 11/12/2024
+ms.date: 11/19/2024
 ms.author: eur
 ---
 
-
 ### November 2024 release
 
 #### Speech to text REST API version 2024-11-15
@@ -22,6 +21,10 @@ Fast transcription is now generally available via [speech to text REST API versi
 
 ### October 2024 release
 
+#### Real-time speech to text (bilingual)
+
+Significant improvements have been made to the recognition quality of short Spanish terms via the `es-US` bilingual models. The model is bilingual and also supports English. The quality of English recognition is also improved.
+
 #### Video translation (Preview)
 
 The video translation API is now available in public preview. For more information, see the [How to use video translation](../../video-translation-get-started.md?pivots=rest-api).
diff --git a/articles/ai-services/speech-service/meeting-transcription.md b/articles/ai-services/speech-service/meeting-transcription.md
@@ -29,6 +29,7 @@ Conversation transcription multichannel diarization (preview) is retiring on Mar
 To continue using speech to text with diarization, use the following features instead:
 
 - [Real-time speech to text with diarization](get-started-stt-diarization.md)
+- [Fast transcription with diarization](fast-transcription-create.md)
 - [Batch transcription with diarization](batch-transcription.md)
 
 These speech to text features only support diarization for single-channel audio. Multichannel audio that you used with conversation transcription multichannel diarization isn't supported. 
diff --git a/articles/ai-services/speech-service/overview.md b/articles/ai-services/speech-service/overview.md
@@ -59,17 +59,14 @@ With [real-time speech to text](get-started-speech-to-text.md), the audio is tra
 - Dictation
 - Voice agents
 
-## Fast transcription API (Preview)
+## Fast transcription API
 
 Fast transcription API is used to transcribe audio files with returning results synchronously and much faster than real-time audio. Use fast transcription in the scenarios that you need the transcript of an audio recording as quickly as possible with predictable latency, such as: 
 
 - Quick audio or video transcription, subtitles, and edit. 
 - Video translation 
 
-> [!NOTE]
-> Fast transcription API is only available via the speech to text REST API version 2024-05-15-preview. 
-
-To get started with fast transcription, see [use the fast transcription API (preview)](fast-transcription-create.md).
+To get started with fast transcription, see [use the fast transcription API](fast-transcription-create.md).
 
 ### Batch transcription
 
diff --git a/articles/ai-services/speech-service/speech-to-text.md b/articles/ai-services/speech-service/speech-to-text.md
@@ -19,7 +19,7 @@ Azure AI Speech service offers advanced speech to text capabilities. This featur
 
 The speech to text service offers the following core features: 
 - [Real-time](#real-time-speech-to-text) transcription: Instant transcription with intermediate results for live audio inputs. 
-- [Fast transcription](#fast-transcription-preview): Fastest synchronous output for situations with predictable latency. 
+- [Fast transcription](#fast-transcription): Fastest synchronous output for situations with predictable latency. 
 - [Batch transcription](#batch-transcription-api): Efficient processing for large volumes of prerecorded audio. 
 - [Custom speech](#custom-speech): Models with enhanced accuracy for specific domains and conditions. 
 
@@ -36,17 +36,14 @@ Real-time speech to text transcribes audio as it's recognized from a microphone
 Real-time speech to text can be accessed via the Speech SDK, Speech CLI, and REST API, allowing integration into various applications and workflows. 
 Real-time speech to text is available via the [Speech SDK](speech-sdk.md), the [Speech CLI](spx-overview.md), and REST APIs such as the [Fast transcription API](fast-transcription-create.md). 
 
-## Fast transcription (Preview)
+## Fast transcription
 
 Fast transcription API is used to transcribe audio files with returning results synchronously and faster than real-time audio. Use fast transcription in the scenarios that you need the transcript of an audio recording as quickly as possible with predictable latency, such as: 
 
 - **Quick audio or video transcription and subtitles**: Quickly get a transcription of an entire video or audio file in one go.
 - **Video translation**: Immediately get new subtitles for a video if you have audio in different languages. 
 
-> [!NOTE]
-> Fast transcription API is only available via the speech to text REST API version 2024-05-15-preview and later. 
-
-To get started with fast transcription, see [use the fast transcription API (preview)](fast-transcription-create.md).
+To get started with fast transcription, see [use the fast transcription API](fast-transcription-create.md).
 
 ## Batch transcription API
 
diff --git a/articles/ai-services/speech-service/speech-translation.md b/articles/ai-services/speech-service/speech-translation.md
@@ -29,7 +29,7 @@ The core features of speech translation include:
 
 - [Speech to text translation](#speech-to-text-translation)
 - [Speech to speech translation](#speech-to-speech-translation)
-- [Multi-lingual speech translation](#multi-lingual-speech-translation-preview)
+- [Multi-lingual speech translation](#multi-lingual-speech-translation)
 - [Multiple target languages translation](#multiple-target-languages-translation)
 
 ## Speech to text translation
@@ -40,7 +40,7 @@ The standard feature offered by the Speech service is the ability to take in an
 
 As a supplement to the above feature, the Speech service also offers the option to read aloud the translated text using our large database of pretrained voices, allowing for a natural output of the input speech. 
 
-## Multi-lingual speech translation (Preview)
+## Multi-lingual speech translation
 
 Multi-lingual speech translation implements a new level of speech translation technology that unlocks various capabilities, including having no specified input language, handling language switches within the same session, and supporting live streaming translations into English. These features enable a new level of speech translation powers that can be implemented into your products. 
 
@@ -53,9 +53,7 @@ Some use cases for multi-lingual speech translation include:
 - Travel Interpreter. When traveling abroad, multi-lingual speech translation offers the ability to create a solution that allows customers to translate any input audio to and from the local language. This allows them to communicate with the locals and better understand their surroundings. 
 - Business Meeting. In a meeting with people who speak different languages, multi-lingual speech translation allows the members of the meeting to all communicate with each other naturally as if there was no language barrier. 
 
-For multi-lingual speech translation, these are the languages the Speech service can automatically detect and switch between from the input: Arabic (ar), Basque (eu), Bosnian (bs), Bulgarian (bg), Chinese Simplified (zh), Chinese Traditional (zhh), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), Galician (gl), German (de), Greek (el), Hindi (hi), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Latvian (lv), Lithuanian (lt), Macedonian (mk), Norwegian (nb), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Serbian (sr), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Thai (th), Turkish (tr), Ukrainian (uk), Vietnamese (vi), and Welsh (cy).
-
-For a list of the supported output (target) languages, see the *Translate to text language* table in the [language and voice support documentation](language-support.md?tabs=speech-translation).
+For a list of the supported input (source) languages, see the [speech to text languages documentation](language-support.md?tabs=stt). For a list of the supported output (target) languages, see the *Translate to text language* table in the [speech translation languages documentation](language-support.md?tabs=speech-translation).
 
 For more information on multi-lingual speech translation, see [the speech translation how to guide](./how-to-translate-speech.md#multi-lingual-speech-translation-without-source-language-candidates) and [speech translation samples on GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/translation_samples.cs#L472).