articles/ai-services/content-understanding/video/overview.md
Lines changed: 5 additions & 16 deletions
@@ -55,12 +55,10 @@ Calling prebuilt-video with no custom schema returns a document like the followi
 # Video: 00:00.000 → 00:30.000
 Width: 1280 · Height: 720
 
-## Segment 1 00:00.000 → 00:06.400
-A lively gathering in a room decorated with colorful banners and balloons. Party guests watch a TV showing a sports event while a young man kneels excitedly in front. Snacks and drinks underline the festive mood.
-
-**Transcript**
+Transcript
 WEBVTT
-00:03.600 → 00:06.000 <1 Speaker> Get New Years ready.
+00:03.600 --> 00:06.000 <1 Speaker> Get new years ready.
+00:11.120 --> 00:13.520 <1 Speaker>Find your style for the new year
 
 **Key frames**
 - 00:00.600
@@ -71,16 +69,7 @@ Calling prebuilt-video with no custom schema returns a document like the followi
 - 00:05.600
 - 00:06.200
 
-## Segment 2 00:06.400 → 00:10.080
-The room erupts into a vibrant party scene—people dancing under soccer-themed décor, flags waving, energy soaring.
-
-**Key frames**
-- 00:07.080
-- 00:07.760
-- 00:08.560
-- 00:09.360
-
-*…additional segments omitted for brevity…*
+*…additional data omitted for brevity…*
 ````
 
 ## Walk-through
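
The transcript cues in the sample output above appear to follow a `start --> end <speaker> text` shape. As a minimal sketch, assuming that shape holds for every cue line, such lines could be parsed as shown here. The regular expression and the `Cue`, `to_seconds`, and `parse_cue` names are hypothetical helpers for illustration and are not part of any Content Understanding SDK.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for cue lines as they appear in the sample output,
# e.g. "00:03.600 --> 00:06.000 <1 Speaker> Get new years ready."
CUE_RE = re.compile(
    r"^(?P<start>\d{2}:\d{2}\.\d{3})\s*-->\s*(?P<end>\d{2}:\d{2}\.\d{3})"
    r"\s*<(?P<speaker>[^>]*)>\s*(?P<text>.*)$"
)


class Cue(NamedTuple):
    start_s: float  # cue start, in seconds
    end_s: float    # cue end, in seconds
    speaker: str
    text: str


def to_seconds(stamp: str) -> float:
    """Convert an 'MM:SS.mmm' timestamp to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def parse_cue(line: str) -> Optional[Cue]:
    """Return a Cue for a matching line, or None for any other line."""
    match = CUE_RE.match(line.strip())
    if match is None:
        return None
    return Cue(
        to_seconds(match["start"]),
        to_seconds(match["end"]),
        match["speaker"].strip(),
        match["text"].strip(),
    )
```

Lines that are not cues, such as the `WEBVTT` header or a key frame entry, return `None`, so a caller can filter the document line by line without knowing where the transcript starts.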
@@ -113,8 +102,8 @@ The first pass is all about extracting a first set of details—who's speaking,
 > When Multilingual transcription is used, any files with unsupported locales produce a result based on the closest supported locale, which is likely incorrect. This result is a known
 > behavior. Avoid transcription quality issues by ensuring that you configure locales when not using a multilingual transcription supported locale!
 
-* **Shot detection:** Identifies segments of the video aligned with shot boundaries where possible, allowing for precise editing and repackaging of content with breaks exactly on shot boundaries.
 * **Key frame extraction:** Extracts key frames from videos to represent each shot completely, ensuring each shot has enough key frames to enable field extraction to work effectively.
+* **Shot detection:** Identifies segments of the video aligned with shot boundaries where possible, allowing for precise editing and repackaging of content with breaks exactly on shot boundaries. This is output as a