Commit 01a2c76

Merge pull request #107970 from IEvangelist/tocAgain
Updates from working with Chris
2 parents a12d2ce + 66ad50e commit 01a2c76

3 files changed: +26 -21 lines changed

articles/cognitive-services/Speech-Service/batch-transcription.md

Lines changed: 15 additions & 15 deletions
@@ -1,5 +1,5 @@
 ---
-title: How to use batch transcription - Speech service
+title: What is batch transcription - Speech service
 titleSuffix: Azure Cognitive Services
 description: Batch transcription is ideal if you want to transcribe a large quantity of audio in storage, such as Azure Blobs. By using the dedicated REST API, you can point to audio files with a shared access signature (SAS) URI and asynchronously receive transcriptions.
 services: cognitive-services
@@ -8,11 +8,11 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: conceptual
-ms.date: 03/16/2020
+ms.date: 03/17/2020
 ms.author: panosper
 ---

-# How to use batch transcription
+# What is batch transcription?

 Batch transcription is ideal for transcribing a large amount of audio in storage. By using the dedicated REST API, you can point to audio files with a shared access signature (SAS) URI and asynchronously receive transcription results.

@@ -48,11 +48,11 @@ If you plan to customize acoustic or language models, follow the steps in [Custo

 The Batch Transcription API supports the following formats:

-| Format | Codec | Bitrate | Sample Rate |
-|--------|-------|---------|-------------|
-| WAV | PCM | 16-bit | 8 kHz or 16 kHz, mono or stereo |
-| MP3 | PCM | 16-bit | 8 kHz or 16 kHz, mono or stereo |
-| OGG | OPUS | 16-bit | 8 kHz or 16 kHz, mono or stereo |
+| Format | Codec | Bitrate | Sample Rate |
+|--------|-------|---------|---------------------------------|
+| WAV | PCM | 16-bit | 8 kHz or 16 kHz, mono or stereo |
+| MP3 | PCM | 16-bit | 8 kHz or 16 kHz, mono or stereo |
+| OGG | OPUS | 16-bit | 8 kHz or 16 kHz, mono or stereo |

 For stereo audio streams, the left and right channels are split during the transcription. For each channel, a JSON result file is being created. The timestamps generated per utterance enable the developer to create an ordered final transcript.
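
Reviewer note: the context line above says the per-utterance timestamps let a developer build one ordered transcript from the two per-channel result files. A minimal Python sketch of that merge follows. The `Utterance` class, its field names, and the sample values are hypothetical stand-ins for illustration and are not the service's result schema; the sketch also assumes each channel's utterances are already sorted by their start offset.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Utterance:
    # Hypothetical shape: not the field names used in the service's JSON.
    channel: int    # 0 = left, 1 = right
    offset: float   # start time in seconds from the beginning of the audio
    text: str


def merge_channels(*channels):
    """Merge per-channel utterance lists (each sorted by offset) into one
    list ordered by start time."""
    return list(heapq.merge(*channels, key=lambda u: u.offset))


# Made-up sample data standing in for the two per-channel result files.
left = [
    Utterance(0, 0.0, "Hello, thanks for calling."),
    Utterance(0, 5.2, "Sure, one moment."),
]
right = [
    Utterance(1, 2.1, "Hi, I need help with my order."),
]

transcript = merge_channels(left, right)
for u in transcript:
    print(f"[{u.offset:6.1f}s] channel {u.channel}: {u.text}")
```

`heapq.merge` is used because it interleaves already-sorted inputs without re-sorting; if the per-channel lists are not guaranteed to be ordered, concatenating them and calling `sorted(..., key=...)` would be the safer choice.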
@@ -142,7 +142,7 @@ For mono input audio, one transcription result file is being created. For stereo

 ```json
 {
-"AudioFileResults":[
+"AudioFileResults":[
 {
 "AudioFileName": "Channel.0.wav | Channel.1.wav" 'maximum of 2 channels supported'
 "AudioFileUrl": null 'always null'
@@ -204,12 +204,12 @@ For mono input audio, one transcription result file is being created. For stereo

 The result contains these forms:

-|Form|Content|
-|-|-|
-|`Lexical`|The actual words recognized.
-|`ITN`|Inverse-text-normalized form of the recognized text. Abbreviations ("doctor smith" to "dr smith"), phone numbers, and other transformations are applied.
-|`MaskedITN`|The ITN form with profanity masking applied.
-|`Display`|The display form of the recognized text. This includes added punctuation and capitalization.
+| Form | Content |
+|-------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Lexical` | The actual words recognized. |
+| `ITN` | Inverse-text-normalized form of the recognized text. Abbreviations ("doctor smith" to "dr smith"), phone numbers, and other transformations are applied. |
+| `MaskedITN` | The ITN form with profanity masking applied. |
+| `Display` | The display form of the recognized text. This includes added punctuation and capitalization. |

 ## Speaker separation (Diarization)

articles/cognitive-services/Speech-Service/index-speech-to-text.yml

Lines changed: 7 additions & 3 deletions
@@ -1,7 +1,7 @@
 ### YamlMime:Landing

 title: Speech-to-text documentation
-summary: Speech-to-text from the Speech service, also known as speech recognition, enables real-time transcription of audio streams into text.
+summary: Speech-to-text from the Speech service, also known as speech recognition, enables real-time and batch transcription of audio streams into text.
 metadata:
 title: Speech-to-text documentation - Tutorials, API Reference - Azure Cognitive Services | Microsoft Docs
 titleSuffix: Azure Cognitive Services
@@ -10,16 +10,20 @@ metadata:
 manager: nitinme
 ms.service: speech-service
 ms.topic: landing-page
-ms.date: 03/10/2020
+ms.date: 03/17/2020
 ms.author: dapine

 landingContent:
 - title: About speech-to-text
 linkLists:
 - linkListType: overview
 links:
-- text: What is speech-to-text?
+- text: What is real-time speech-to-text?
 url: speech-to-text.md
+- text: What is batch speech-to-text?
+url: batch-transcription.md
+- text: Speech recognition basics?
+url: speech-to-text-basics.md
 - linkListType: quickstart
 links:
 - text: Recognize speech with microphone input

articles/cognitive-services/Speech-Service/toc.yml

Lines changed: 4 additions & 3 deletions
@@ -18,8 +18,10 @@
 href: index-speech-to-text.yml
 - name: Overview
 items:
-- name: What is speech-to-text?
+- name: What is real-time speech-to-text?
 href: speech-to-text.md
+- name: What is batch speech-to-text?
+href: batch-transcription.md
 - name: Speech recognition basics
 href: speech-to-text-basics.md
 - name: Quickstart
@@ -40,6 +42,7 @@
 href: how-to-custom-speech.md
 - name: Use compressed audio input formats
 href: how-to-use-codec-compressed-audio-input-streams.md
+displayName: codec,codecs,compression,compressed,mp3,flac,mulaw,alaw,mp4,mp4a,wav,opus,ogg,pcm,silk
 - name: Improve accuracy with Phrase Lists
 href: how-to-phrase-lists.md
 - name: Improve accuracy with tenant models
@@ -95,8 +98,6 @@
 items:
 - name: Speech-to-text REST API
 href: rest-speech-to-text.md
-- name: Batch transcription REST API
-href: batch-transcription.md
 - name: Resources
 items:
 - name: Language support
