Commit 8ada4a3

Update overview.md
Fixing a set of feedbacks
1 parent 556a9af commit 8ada4a3

articles/ai-services/content-understanding/video/overview.md

Lines changed: 28 additions & 24 deletions
@@ -31,11 +31,13 @@ The **pre-built video analyzer** outputs RAG-ready Markdown that includes:

This format can drop straight into a vector store to enable an agent or RAG workflows—no post-processing required.

-From there you can **customize the analyzer** for more fine-grained control of the output. You can define custom fields, segments, or enable face identification. Customization allows you to use the full power of generative models to extract deep insights from the visual and audio details of the video. For example, customization allows you to:
+From there you can **customize the analyzer** for more fine-grained control of the output. You can define custom fields, segments, or enable face identification. Customization allows you to use the full power of generative models to extract deep insights from the visual and audio details of the video.

-- Identify what products and brands are seen or mentioned in the video.
-- Segment a news broadcast into chapters based on the topics or news stories discussed.
-- Use face identification to label speakers as executives, for example, `CEO John Doe`, `CFO Jane Smith`.
+For example, customization allows you to:
+
+- **Define custom fields:** to identify what products and brands are seen or mentioned in the video.
+- **Generate custom segments:** to segment a news broadcast into chapters based on the topics or news stories discussed.
+- **Identify people using a person directory** enabling a customer to label conference speakers in footage using face identification, for example, `CEO John Doe`, `CFO Jane Smith`.

 ## Why use Content Understanding for video?

@@ -49,50 +51,52 @@ Content understanding for video has broad potential uses. For example, you can c
 ## Prebuilt video analyzer example

 With the prebuilt video analyzer (prebuilt-videoAnalyzer), you can upload a video and get an immediately usable knowledge asset. The service packages every clip into both richly formatted Markdown and JSON. This process allows your search index or chat agent to ingest without custom glue code.
-Calling prebuilt-video with no custom schema returns a document like the following (abridged) example:

-```markdown
+For example, creating the base prebuilt-videoAnalyzer like this:
+
+```
+{
+"config": {},
+"BaseAnalyzerId": "prebuilt-videoAnalyzer",
+}
+```
+
+Then analyzing a 30-second advertising video, would result in the following output:
+
+````markdown
 # Video: 00:00.000 => 00:30.000
 Width: 1280
 Height: 720

 ## Segment 1: 00:00.000 => 00:06.000
-A lively room filled with people is shown, where a group of friends is gathered around a television. They are watching a sports event, possibly a football match, as indicated by the decorations and the atmosphere. The AliExpress logo is prominently displayed, suggesting a connection to the ongoing event.
+A lively room filled with people is shown, where a group of friends is gathered around a television. They are watching a sports event, possibly a football match, as indicated by the decorations and the atmosphere.

 Transcript
 ```
 WEBVTT

 00:03.600 --> 00:06.000
-<Speaker 1 Speaker>Get Euro ready with AliExpress.
+<Speaker 1 Speaker>Get new years ready.
 ```

 Key Frames
 - 00:00.600 ![](keyFrame.600.jpg)
 - 00:01.200 ![](keyFrame.1200.jpg)
-- 00:02.560 ![](keyFrame.2560.jpg)
-- 00:03.280 ![](keyFrame.3280.jpg)
-- 00:04.560 ![](keyFrame.4560.jpg)
-- 00:05.600 ![](keyFrame.5600.jpg)

 ## Segment 2: 00:06.000 => 00:10.080
-The scene transitions to a more vibrant and energetic setting, where the group of friends is now celebrating. The room is decorated with football-themed items, and everyone is cheering and enjoying the moment. The AliExpress branding continues to be visible, emphasizing the theme of shopping and celebration.
+The scene transitions to a more vibrant and energetic setting, where the group of friends is now celebrating. The room is decorated with football-themed items, and everyone is cheering and enjoying the moment.

 Transcript
 ```
 WEBVTT

 00:03.600 --> 00:06.000
-<Speaker 1 Speaker>Get Euro ready with AliExpress.
+<Speaker 1 Speaker>Go team!
 ```

 Key Frames
 - 00:06.200 ![](keyFrame.6200.jpg)
 - 00:07.080 ![](keyFrame.7080.jpg)
-- 00:07.760 ![](keyFrame.7760.jpg)
-- 00:08.560 ![](keyFrame.8560.jpg)
-- 00:09.360 ![](keyFrame.9360.jpg)
-

 *…additional data omitted for brevity…*
 ````
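
The sample output in this hunk gives each segment a regular `## Segment N: start => end` header, so a consumer could recover the segment time ranges with a few lines of parsing. A minimal sketch in Python, assuming that header shape and `MM:SS.mmm` timestamps hold for all output (the sample shows only these). `parse_segments` and `to_seconds` are hypothetical helper names, not part of the Content Understanding API:

```python
import re

# Matches headers such as "## Segment 1: 00:00.000 => 00:06.000".
# Assumption: timestamps are MM:SS.mmm, as in the sample output.
SEGMENT_RE = re.compile(
    r"^## Segment (\d+): (\d+):(\d+\.\d+) => (\d+):(\d+\.\d+)\s*$",
    re.MULTILINE,
)


def to_seconds(minutes: str, seconds: str) -> float:
    """Convert the MM and SS.mmm parts of a timestamp to seconds."""
    return int(minutes) * 60 + float(seconds)


def parse_segments(markdown: str) -> list[dict]:
    """Return one dict per segment header found in the Markdown text."""
    return [
        {
            "index": int(m.group(1)),
            "start": to_seconds(m.group(2), m.group(3)),
            "end": to_seconds(m.group(4), m.group(5)),
        }
        for m in SEGMENT_RE.finditer(markdown)
    ]


# Headers copied from the sample output; body text omitted.
sample = """# Video: 00:00.000 => 00:30.000
## Segment 1: 00:00.000 => 00:06.000
## Segment 2: 00:06.000 => 00:10.080
"""
segments = parse_segments(sample)
print(segments)
```

The `# Video:` line is skipped because the pattern only accepts `## Segment` headers. If the service emits hour-level timestamps for longer videos, the pattern would need to be extended.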
@@ -102,7 +106,7 @@ Key Frames
 We recently published a walk-through for RAG on Video using Content Understanding.
 [https://www.youtube.com/watch?v=fafneWnT2kw\&lc=Ugy2XXFsSlm7PgIsWQt4AaABAg](https://www.youtube.com/watch?v=fafneWnT2kw&lc=Ugy2XXFsSlm7PgIsWQt4AaABAg)

-## Capabilities
+# Capabilities

 1. [Content extraction](#content-extraction-capabilities)
 1. [Field extraction](#field-extraction-and-segmentation)
@@ -178,6 +182,10 @@ Shape the output to match your business vocabulary. Use a `fieldSchema` object w


 ### Segmentation mode
+> [!NOTE]
+>
+> Setting segmentation triggers field extraction even if no fields are defined.
+

 Content Understanding offers three ways to slice a video, letting you get the output you need for whole videos or short clips. You can use these options by setting the `SegmentationMode` property on a custom analyzer.

@@ -206,11 +214,7 @@ Content Understanding offers three ways to slice a video, letting you get the ou
 "segmentationDefinition": "news broadcasts divided by individual stories"
 }
 ```
-
-> [!NOTE]
->
-> Setting segmentation triggers field extraction even if no fields are defined.
-
+
 ## Face identification description add-on

 > [!NOTE]
