Commit cd22dab

Update batch-transcription.md
update with diarization info
1 parent 8623020 commit cd22dab

1 file changed

+34
-0
lines changed

articles/cognitive-services/Speech-Service/batch-transcription.md

Lines changed: 34 additions & 0 deletions
@@ -97,6 +97,40 @@ Polling for transcription status may not be the most performant, or provide the

For more details, see [Webhooks](webhooks.md).

## Speaker Separation (Diarization)

Diarization is the process of separating speakers in a piece of audio. The Batch pipeline supports diarization and can recognize two speakers on mono channel recordings.

To have a transcription request processed for diarization, add the relevant parameter to the HTTP request as shown below.

```json
{
  "recordingsUrl": "<URL to the Azure blob to transcribe>",
  "models": [{"Id":"<optional acoustic model ID>"},{"Id":"<optional language model ID>"}],
  "locale": "<locale to use, for example en-US>",
  "name": "<user defined name of the transcription batch>",
  "description": "<optional description of the transcription>",
  "properties": {
    "AddWordLevelTimestamps" : "True",
    "AddDiarization" : "True"
  }
}
```

Note that word-level timestamps must also be turned on, as the parameters in the request above indicate.
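
As an illustration, the request body could be assembled in Python as sketched below. The function name `build_diarization_request` and its arguments are hypothetical and not part of the service; the sketch also assumes that an empty `models` list is acceptable when no custom model IDs are supplied. It only builds and prints the JSON body and does not send anything to the service.

```python
import json


def build_diarization_request(recordings_url, locale, name,
                              description="", model_ids=()):
    """Build a transcription request body with diarization enabled.

    Hypothetical helper: it mirrors the field names of the sample
    request above and always sets both required properties.
    """
    return {
        "recordingsUrl": recordings_url,
        # Assumption: an empty list is acceptable when no model IDs are given.
        "models": [{"Id": model_id} for model_id in model_ids],
        "locale": locale,
        "name": name,
        "description": description,
        "properties": {
            # Both flags are sent as the string "True", as in the sample request.
            "AddWordLevelTimestamps": "True",
            "AddDiarization": "True",
        },
    }


body = build_diarization_request(
    "<URL to the Azure blob to transcribe>", "en-US", "diarization-demo"
)
print(json.dumps(body, indent=2))
```

Setting both properties in one place makes it harder to request diarization without word-level timestamps.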

The corresponding result will contain the speakers identified by a number (only two voices are currently supported, so the speakers will be identified as 'Speaker 1' and 'Speaker 2'), followed by the transcription output.

Also note that diarization is not available for stereo recordings. Furthermore, all JSON output will contain the Speaker tag. If diarization is not used, it will show as 'Speaker: Null'.

Supported locales are listed below.

| Language | Locale |
|--------|-------|
| English | en-US |
| Chinese | zh-CN |
| German | de-DE |

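A client could check a locale against this list before requesting diarization. The following is a minimal sketch; the names `DIARIZATION_LOCALES` and `supports_diarization` are hypothetical, and the comparison is an exact, case-sensitive match against the three locales in the table.

```python
# Locales listed in the table above as supporting diarization.
DIARIZATION_LOCALES = {"en-US", "zh-CN", "de-DE"}


def supports_diarization(locale):
    """Return True if the locale appears in the supported-locales table."""
    return locale in DIARIZATION_LOCALES


print(supports_diarization("en-US"))  # True
print(supports_diarization("fr-FR"))  # False
```
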
## Sentiment

Sentiment is a new feature in Batch Transcription API and is an important feature in the call center domain. Customers can use the `AddSentiment` parameters to their requests to
