`docs/about/concepts/video/abstractions.md` (9 additions, 7 deletions)

```diff
@@ -33,13 +33,15 @@ A pipeline orchestrates stages into an end-to-end workflow. Key characteristics:
 
 ## Stages
 
-A stage represents a single step in your data curation workflow. For example, stages can:
-
-- Download videos
-- Convert video formats
-- Split videos into clips
-- Generate embeddings
-- Calculate scores
+A stage represents a single step in your data curation workflow. Video stages are organized into several functional categories:
+
+- **Input/Output**: Read video files and write processed outputs to storage ([Save & Export Documentation](video-save-export))
+- **Video Clipping**: Split videos into clips using fixed stride or scene-change detection ([Video Clipping Documentation](video-process-clipping))
+- **Frame Extraction**: Extract frames from videos or clips for analysis and embeddings ([Frame Extraction Documentation](video-process-frame-extraction))
+- **Embedding Generation**: Generate clip-level embeddings using InternVideo2 or Cosmos-Embed1 models ([Embeddings Documentation](video-process-embeddings))
+- **Filtering**: Filter clips based on motion analysis and aesthetic quality scores ([Filtering Documentation](video-process-filtering))
+- **Caption and Preview**: Generate captions and preview images from video clips ([Captions & Preview Documentation](video-process-captions-preview))
+- **Deduplication**: Remove near-duplicate clips using embedding-based clustering ([Duplicate Removal Documentation](video-process-dedup))
```
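
To make the stage and pipeline abstractions concrete, here is a minimal sketch in plain Python. Every name in it (`Stage`, `Pipeline`, and the toy `ClipStage` and `FilterStage`) is a hypothetical placeholder for illustration, not the library's actual API:

```python
# A minimal sketch of the stage/pipeline abstraction described above.
# All names here (Stage, Pipeline, ClipStage, FilterStage) are hypothetical
# placeholders for illustration, not the library's actual API.
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

Task = dict[str, Any]  # the unit of work handed from stage to stage


class Stage:
    """One step in the curation workflow, such as clipping or filtering."""

    def process(self, task: Task) -> Task:
        raise NotImplementedError


class ClipStage(Stage):
    """Toy stage: split a video into fixed-stride clips."""

    def __init__(self, stride_s: float = 10.0) -> None:
        self.stride_s = stride_s

    def process(self, task: Task) -> Task:
        duration = task["duration_s"]
        task["clips"] = [
            (float(start), min(start + self.stride_s, duration))
            for start in range(0, int(duration), int(self.stride_s))
        ]
        return task


class FilterStage(Stage):
    """Toy stage: drop clips shorter than a minimum length."""

    def __init__(self, min_len_s: float = 5.0) -> None:
        self.min_len_s = min_len_s

    def process(self, task: Task) -> Task:
        task["clips"] = [
            (start, end)
            for start, end in task["clips"]
            if end - start >= self.min_len_s
        ]
        return task


@dataclass
class Pipeline:
    """Orchestrates stages into an end-to-end workflow."""

    stages: list[Stage] = field(default_factory=list)

    def run(self, task: Task) -> Task:
        # Each stage consumes the previous stage's output.
        for stage in self.stages:
            task = stage.process(task)
        return task


if __name__ == "__main__":
    pipeline = Pipeline(stages=[ClipStage(stride_s=10.0), FilterStage(min_len_s=5.0)])
    result = pipeline.run({"video": "example.mp4", "duration_s": 47.0})
    print(result["clips"])  # five clips, the last truncated at 47.0 s
```

The point is the shared contract: because every stage consumes and returns the same kind of task, categories like clipping, embedding, filtering, and deduplication can be recombined per workflow.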
`docs/get-started/video.md` (17 additions, 21 deletions)

````diff
@@ -117,32 +117,28 @@ Embeddings convert each video clip into a numeric vector that captures visual an
 
 You can choose between two embedding models:
 
-- **Cosmos-Embed1 (default)**: Automatically downloaded to `MODEL_DIR` on first run; good general-purpose performance and lower VRAM usage.
-- **InternVideo2 (IV2)**: Open model that requires the IV2 checkpoint and BERT model files to be available locally; higher VRAM usage.
+- **Cosmos-Embed1 (default)**: Available in three variants—**cosmos-embed1-224p**, **cosmos-embed1-336p**, and **cosmos-embed1-448p**—which differ in input resolution and accuracy/VRAM tradeoff. All variants are automatically downloaded to `MODEL_DIR` on first run.
+  - [cosmos-embed1-224p on Hugging Face](https://huggingface.co/nvidia/Cosmos-Embed1-224p)
+  - [cosmos-embed1-336p on Hugging Face](https://huggingface.co/nvidia/Cosmos-Embed1-336p)
+  - [cosmos-embed1-448p on Hugging Face](https://huggingface.co/nvidia/Cosmos-Embed1-448p)
+- **InternVideo2 (IV2)**: Open model that requires the IV2 checkpoint and BERT model files to be available locally; higher VRAM usage.
+  - [InternVideo Official Github Page](https://github.com/OpenGVLab/InternVideo)
 
-For this quickstart, we're going to set up support for **IV2**.
+For this quickstart, we're going to set up support for **Cosmos-Embed1-224p**.
 
-### Prepare IV2 Model Weights
+### Prepare Model Weights
 
-Complete the following steps when you set `--embedding-algorithm` to `internvideo2` or when you pre-stage models for offline use.
+For most use cases, you only need to create a model directory. The required model files will be downloaded automatically on first run.
 
-1. Create a model directory.
+1. Create a model directory:
+   ```bash
+   mkdir -p "$MODEL_DIR"
+   ```
 :::{tip}
 You can reuse the same `<MODEL_DIR>` across runs.
 :::
-2. Download the IV2 Checkpoint from the [OpenGVLab page](https://github.com/OpenGVLab) and accept the terms.
-3. Download the BERT model files for [`google-bert/bert-large-uncased`](https://huggingface.co/google-bert/bert-large-uncased).
````
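
Because Cosmos-Embed1 weights now download automatically, the removed IV2 steps matter only when switching embedding models or pre-staging weights for offline use. The sketch below uses the `huggingface_hub` Python API for pre-staging; the `repo_id` values come from the links above, while the subdirectory names under `MODEL_DIR` are assumptions to verify against what your pipeline expects. The IV2 checkpoint itself is distributed through OpenGVLab, per the removed step 2, and is not covered here.

```python
# Sketch: pre-stage model weights into MODEL_DIR for offline runs.
# snapshot_download() is the real huggingface_hub API; the subdirectory
# names under MODEL_DIR are assumptions, so check what your pipeline expects.
import os

from huggingface_hub import snapshot_download

model_dir = os.environ["MODEL_DIR"]

# Cosmos-Embed1 variant used in this quickstart (auto-downloaded otherwise).
snapshot_download(
    repo_id="nvidia/Cosmos-Embed1-224p",
    local_dir=os.path.join(model_dir, "Cosmos-Embed1-224p"),
)

# BERT files, needed only when using InternVideo2 (IV2).
snapshot_download(
    repo_id="google-bert/bert-large-uncased",
    local_dir=os.path.join(model_dir, "bert-large-uncased"),
)
```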