articles/ai-services/content-understanding/video/overview.md (7 additions, 6 deletions)
@@ -104,14 +104,15 @@ The service operates in two stages. The first stage, content extraction, involve
 
 The first pass is all about extracting a first set of details—who's speaking, where are the cuts, and which faces recur. It creates a solid metadata backbone that later steps can reason over.
 
-* **Transcription:** Converts conversational audio into searchable and analyzable text-based transcripts in WebVTT format. Sentence-level timestamps are available if `returnDetails=true` is set. Content Understanding supports the full set of Azure AI Speech speech-to-text languages. For sepecifics on the supported languages see [Language and region support](../language-region-support.md#Video) .Additionally, the following transcription details are important to consider:
+* **Transcription:** Converts conversational audio into searchable and analyzable text-based transcripts in WebVTT format. Sentence-level timestamps are available if `returnDetails=true` is set. Content Understanding supports the full set of Azure AI Speech speech-to-text languages. For more information on supported languages, *see* [Language and region support](../language-region-support.md#language-support). The following transcription details are important to consider:
+
 * **Diarization:** Distinguishes between speakers in a conversation in the output, attributing parts of the transcript to specific speakers.
 * **Multilingual transcription:** Generates multilingual transcripts. Language/locale is applied per phrase in the transcript. Phrases output when `returnDetails=true` is set. Deviating from language detection this feature is enabled when no language/locale is specified or language is set to `auto`.
 
 > [!NOTE]
 > When multilingual transcription is used, a file with an unsupported locale still produces a result. This result is based on the closest locale but most likely not correct.
 > This transcription behavior is known. Make sure to configure locales when not using multilingual transcription!
-
+
 * **Shot detection:** Identifies segments of the video aligned with shot boundaries where possible, allowing for precise editing and repackaging of content with breaks exactly on shot boundaries.
 * **Key frame extraction:** Extracts key frames from videos to represent each shot completely, ensuring each shot has enough key frames to enable field extraction to work effectively.
 
@@ -201,12 +202,12 @@ Content Understanding offers three ways to slice a video, letting you get the ou
 Face identification description is an add-on that provides context to content extraction and field extraction using face information.
 
 > [!NOTE]
->
-> This feature is limited access and involves face identification and grouping; customers need to register for access at [Face Recognition](https://aka.ms/facerecognition). Face features incur additional cost.
+>
+> This feature is limited access and involves face identification and grouping; customers need to register for access at [Face Recognition](https://aka.ms/facerecognition). Face features incur added costs.
 
 ### Content extraction: grouping and identification
 
-The face add-on enables grouping and identification as output from the content extraction section. To enable face capibilities set `enableFace=true` in the analyzer configuration.
+The face add-on enables grouping and identification as output from the content extraction section. To enable face capabilities set `enableFace=true` in the analyzer configuration.
 
 ***Grouping:** Grouped faces appearing in a video to extract one representative face image for each person and provides segments where each one is present. The grouped face data is available as metadata and can be used to generate customized metadata fields when `returnDetails: true` for the analyzer.
 ***Identification:** Labels individuals in the video with names based on a Face API person directory. Customers can enable this feature by supplying a name for a Face API directory in the current resource in the `personDirectoryId` property of the analyzer.
@@ -257,7 +258,7 @@ See [Language and region support](../language-region-support.md).
 As with all Azure AI services, review Microsoft's [Data, protection, and privacy](https://www.microsoft.com/trust-center/privacy) documentation.
 
 > [!IMPORTANT]
->
+>
 > If you process **Biometric Data** (for example, enable **Face Grouping** or **Face Identification**), you must meet all notice, consent, and deletion requirements under GDPR or other applicable laws. See [Data and Privacy for Face](/legal/cognitive-services/face/data-privacy-security).
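
The changed text refers to three analyzer settings: `returnDetails`, `enableFace`, and `personDirectoryId`. As a rough illustration only, the sketch below assembles them into a single configuration fragment. The helper name `build_video_analyzer_config`, the enclosing `config` key, the `locales` key, and the rule that a person directory requires the face add-on are all assumptions made for this example. They are not taken from the article, so check the service's API reference before relying on any of them.

```python
import json
from typing import Optional


def build_video_analyzer_config(
    return_details: bool = True,
    enable_face: bool = False,
    person_directory_id: Optional[str] = None,
    locales: Optional[list] = None,
) -> dict:
    """Assemble a hypothetical analyzer configuration fragment.

    Only the property names returnDetails, enableFace, and personDirectoryId
    come from the article text; the surrounding shape is an assumption.
    """
    # Assumption: identification is part of the face add-on, so a person
    # directory without enableFace is treated as a configuration mistake.
    if person_directory_id and not enable_face:
        raise ValueError("personDirectoryId is only meaningful with enableFace=true")

    config = {
        "returnDetails": return_details,  # sentence-level timestamps, phrase details
        "enableFace": enable_face,        # face grouping and identification add-on
    }
    if person_directory_id:
        config["personDirectoryId"] = person_directory_id
    if locales:
        # Assumption: key name "locales"; the article only says to configure
        # locales when multilingual transcription is not used.
        config["locales"] = list(locales)
    return {"config": config}


if __name__ == "__main__":
    example = build_video_analyzer_config(
        enable_face=True,
        person_directory_id="my-person-directory",  # hypothetical directory name
        locales=["en-US"],
    )
    print(json.dumps(example, indent=2))
```

Keeping the settings in one small builder makes it easy to see which options depend on each other, for example that face identification presupposes the limited-access face add-on described in the note above.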