Commit 796fbcd

suiyoubi and ayushdg authored and committed
Purge InternVideo2 (#1451)
* Remove Internvideo2
  Signed-off-by: Ao Tang <aot@nvidia.com>
* more to remove
  Signed-off-by: Ao Tang <aot@nvidia.com>
* fix writer
  Signed-off-by: Ao Tang <aot@nvidia.com>
* Enhance Clip class to include cosmos_embed1_frames and cosmos_embed1_embedding in total size calculation
  Signed-off-by: Ao Tang <aot@nvidia.com>
* remove iv2
  Signed-off-by: Ao Tang <aot@nvidia.com>

---------

Signed-off-by: Ao Tang <aot@nvidia.com>
Co-authored-by: Ayush Dattagupta <ayushdg95@gmail.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
1 parent 5273a91 commit 796fbcd
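
The commit message mentions that the Clip size calculation now counts `cosmos_embed1_frames` and `cosmos_embed1_embedding`. The actual class is not shown on this page, so the following is only a minimal sketch of what such a calculation might look like. `ClipSketch`, `encoded_data`, `total_size`, and `_nbytes` are hypothetical names chosen for illustration; only the two `cosmos_embed1_*` field names come from the commit message.

```python
from dataclasses import dataclass
from typing import Any, Optional


def _nbytes(buf: Optional[Any]) -> int:
    """Return the byte size of a buffer-like object, or 0 for None."""
    if buf is None:
        return 0
    # NumPy arrays and memoryviews expose .nbytes; plain bytes fall back to len().
    nbytes = getattr(buf, "nbytes", None)
    return int(nbytes) if nbytes is not None else len(buf)


@dataclass
class ClipSketch:
    """Hypothetical stand-in for the real Clip class (not the repository's code)."""

    encoded_data: Optional[bytes] = None           # assumed field, for illustration
    cosmos_embed1_frames: Optional[Any] = None     # field name from the commit message
    cosmos_embed1_embedding: Optional[Any] = None  # field name from the commit message

    def total_size(self) -> int:
        """Sum the sizes of every payload that is present on the clip."""
        return (
            _nbytes(self.encoded_data)
            + _nbytes(self.cosmos_embed1_frames)
            + _nbytes(self.cosmos_embed1_embedding)
        )
```

Treating missing payloads as zero bytes keeps the sum valid for clips that skipped the embedding stage.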

34 files changed: +60 -2365 lines changed

.cursor/rules/modality-structure.mdc

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ Common operations:
 - Clip extraction
 - GPU H.264 encoding/decoding
 - Motion and aesthetic filtering
-- Embeddings (InternVideo2, Cosmos-Embed1)
+- Embeddings (Cosmos-Embed1)

 Task type: `VideoTask`

.github/workflows/cicd-main.yml

Lines changed: 0 additions & 22 deletions
@@ -144,26 +144,6 @@ jobs:
       - uses: actions/checkout@v4
         with:
           submodules: recursive
-
-      - name: Cache InternVideo
-        id: cache-internvideo
-        uses: actions/cache@v4
-        with:
-          path: InternVideo
-          key: internvideo-${{ hashFiles('external/intern_video2_multimodal.patch') }}-09d872e5093296c6f36b8b3a91fc511b76433bf7
-
-      - name: Checkout InternVideo
-        if: steps.cache-internvideo.outputs.cache-hit != 'true'
-        uses: actions/checkout@v4
-        with:
-          repository: OpenGVLab/InternVideo
-          path: InternVideo
-          ref: 09d872e5093296c6f36b8b3a91fc511b76433bf7
-
-      - name: Patch InternVideo
-        if: steps.cache-internvideo.outputs.cache-hit != 'true'
-        run: cd InternVideo && patch -p1 < ../external/intern_video2_multimodal.patch
-
       - name: Free up disk space on Ubuntu
         run: |
           sudo rm -rf /usr/share/dotnet
@@ -178,9 +158,7 @@ jobs:
       - name: Run tests ${{ matrix.folder }} (CPU)
         timeout-minutes: 40
         run: |
-          sed -i "/InternVideo/d" .gitignore
           uv sync --link-mode copy --locked --extra audio_cpu --extra text_cpu --extra video_cpu --group test
-          uv add InternVideo/InternVideo2/multi_modality
           FOLDER="${{ matrix.folder }}"
           FOLDER="${FOLDER/stages-/stages/}"
           uv run coverage run --branch --source=nemo_curator -m pytest -v "tests/$FOLDER" -m "not gpu"

.gitignore

Lines changed: 0 additions & 3 deletions
@@ -155,6 +155,3 @@ data/

 # macOS Files
 .DS_Store
-
-# InternVideo2 dependency (cloned by installation script)
-InternVideo/

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ NeMo Curator now supports comprehensive [video data curation](https://docs.nvidi
 - **Video splitting**: [Fixed-stride](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/clipping.html) and [scene-change detection (TransNetV2)](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/clipping.html) for clip extraction
 - **Semantic deduplication**: [K-means clustering and pairwise similarity](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/dedup.html) for near-duplicate clip removal
 - **Content filtering**: [Motion-based filtering](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/filtering.html) and [aesthetic filtering](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/filtering.html) for quality improvement
-- **Embedding generation**: InternVideo2 and Cosmos-Embed1 models for clip-level embeddings
+- **Embedding generation**: Cosmos-Embed1 models for clip-level embeddings
 - **Ray-based distributed architecture**: Scalable video processing with autoscaling support

 #### Audio

README.md

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ Process large-scale video corpora with distributed, GPU-accelerated pipelines fo
 | **Data Loading** | Local paths • S3-compatible storage • HTTP(S) URLs | [Load Data](https://docs.nvidia.com/nemo/curator/latest/curate-video/load-data/index.html) |
 | **Clipping** | Fixed-stride splitting • Scene-change detection (TransNetV2) | [Clipping](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/clipping.html) |
 | **Processing** | GPU H.264 encoding • Frame extraction • Motion filtering • Aesthetic filtering | [Processing](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/filtering.html) |
-| **Embeddings** | InternVideo2 and Cosmos-Embed1 for clip-level embeddings | [Embeddings](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/embeddings.html) |
+| **Embeddings** | Cosmos-Embed1 for clip-level embeddings | [Embeddings](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/embeddings.html) |
 | **Deduplication** | K-means clustering • Pairwise similarity for near-duplicates | [Deduplication](https://docs.nvidia.com/nemo/curator/latest/curate-video/process-data/dedup.html) |

 ---

docker/Dockerfile

Lines changed: 1 addition & 12 deletions
@@ -62,16 +62,6 @@ COPY docker/common/install_ffmpeg.sh .
 RUN bash install_ffmpeg.sh && \
     rm install_ffmpeg.sh

-
-ARG INTERN_VIDEO_COMMIT=09d872e5093296c6f36b8b3a91fc511b76433bf7
-COPY external/intern_video2_multimodal.patch .
-# Clone InternVideo (Video curation dependency)
-RUN git clone https://github.com/OpenGVLab/InternVideo.git && \
-    cd InternVideo && \
-    git checkout ${INTERN_VIDEO_COMMIT} && \
-    patch -p1 < /opt/intern_video2_multimodal.patch && \
-    rm /opt/intern_video2_multimodal.patch
-
 FROM nemo_curator_dep AS nemo_curator

 WORKDIR /opt/Curator
@@ -81,8 +71,7 @@ COPY pyproject.toml uv.lock /opt/Curator/
 COPY nemo_curator/__init__.py nemo_curator/package_info.py /opt/Curator/nemo_curator/

 # Install Curator
-RUN uv sync --link-mode copy --locked --extra all --all-groups --no-cache && \
-    uv add /opt/InternVideo/InternVideo2/multi_modality
+RUN uv sync --link-mode copy --locked --extra all --all-groups --no-cache

 COPY . /opt/Curator

docs/about/concepts/video/abstractions.md

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ A stage represents a single step in your data curation workflow. Video stages ar
 - **Input/Output**: Read video files and write processed outputs to storage ([Save & Export Documentation](video-save-export))
 - **Video Clipping**: Split videos into clips using fixed stride or scene-change detection ([Video Clipping Documentation](video-process-clipping))
 - **Frame Extraction**: Extract frames from videos or clips for analysis and embeddings ([Frame Extraction Documentation](video-process-frame-extraction))
-- **Embedding Generation**: Generate clip-level embeddings using InternVideo2 or Cosmos-Embed1 models ([Embeddings Documentation](video-process-embeddings))
+- **Embedding Generation**: Generate clip-level embeddings using Cosmos-Embed1 models ([Embeddings Documentation](video-process-embeddings))
 - **Filtering**: Filter clips based on motion analysis and aesthetic quality scores ([Filtering Documentation](video-process-filtering))
 - **Caption and Preview**: Generate captions and preview images from video clips ([Captions & Preview Documentation](video-process-captions-preview))
 - **Deduplication**: Remove near-duplicate clips using embedding-based clustering ([Duplicate Removal Documentation](video-process-dedup))

docs/about/concepts/video/data-flow.md

Lines changed: 0 additions & 2 deletions
@@ -29,8 +29,6 @@ Writer stages produce the following directories under the configured output path
 - `filtered_clips/`: MP4 files for filtered clips
 - `previews/`: WebP preview images for windows
 - `metas/v0/`: Per-clip JSON metadata files
-- `iv2_embd/`: Per-clip embeddings (pickle) for InternVideo2
-- `iv2_embd_parquet/`: Aggregated per-video embeddings (parquet) for InternVideo2
 - `ce1_embd/`: Per-clip embeddings (pickle) for Cosmos-Embed1
 - `ce1_embd_parquet/`: Aggregated per-video embeddings (parquet) for Cosmos-Embed1
 - `processed_videos/`: Per-video JSON metadata files
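
Output paths written before this change may still contain the two InternVideo2 directories that the docs no longer list. As a minimal sketch (not part of the commit), the helper below reports which of them are present under an output path. The function name `find_stale_iv2_dirs` is hypothetical; the directory names are taken from the removed lines above.

```python
from pathlib import Path
from typing import List, Union

# Directory names removed from the writer documentation in this commit.
STALE_DIRS = ("iv2_embd", "iv2_embd_parquet")


def find_stale_iv2_dirs(output_path: Union[str, Path]) -> List[Path]:
    """Return the InternVideo2 output directories that exist under output_path."""
    root = Path(output_path)
    return [root / name for name in STALE_DIRS if (root / name).is_dir()]
```

Calling `find_stale_iv2_dirs("/path/to/output")` returns an empty list when neither directory exists, so it is safe to run on fresh output paths.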

docs/about/release-notes/index.md

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ NeMo Curator now supports comprehensive [video data curation](../../curate-video
 - **Video splitting**: [Fixed-stride](../../curate-video/process-data/clipping.md) and [scene-change detection (TransNetV2)](../../curate-video/process-data/clipping.md) for clip extraction
 - **Semantic deduplication**: [K-means clustering and pairwise similarity](../../curate-video/process-data/dedup.md) for near-duplicate clip removal
 - **Content filtering**: [Motion-based filtering](../../curate-video/process-data/filtering.md) and [aesthetic filtering](../../curate-video/process-data/filtering.md) for quality improvement
-- **Embedding generation**: InternVideo2 and Cosmos-Embed1 models for clip-level embeddings
+- **Embedding generation**: Cosmos-Embed1 models for clip-level embeddings
 - **Enhanced captioning**: [VL-based caption generation with optional LLM-based rewriting](../../curate-video/process-data/captions-preview.md) (Qwen-VL and Qwen-LM supported) for detailed video descriptions
 - **Ray-based distributed architecture**: Scalable video processing with [autoscaling support](../concepts/video/architecture.md)

docs/admin/installation.md

Lines changed: 0 additions & 41 deletions
@@ -87,13 +87,6 @@ curl -LsSf https://astral.sh/uv/install.sh | sh
 uv sync --all-extras --all-groups
 ```

-Optional InternVideo2 installation steps:
-
-```bash
-bash external/intern_video2_installation.sh
-uv add InternVideo/InternVideo2/multi_modality
-```
-
 :::

 :::{tab-item} Container Installation
@@ -164,40 +157,6 @@ If encoders are missing, reinstall `FFmpeg` with the required options or use the
 :::
 ::::

-### InternVideo2 Support (Optional for Video)
-
-Video processing includes optional support for InternVideo2. To install InternVideo2, run these commands before installing NeMo Curator based on whether you install via PyPI or from source:
-
-::::{tab-set}
-
-:::{tab-item} PyPI Installation
-```bash
-# Clone and set up InternVideo2
-git clone https://github.com/OpenGVLab/InternVideo.git
-cd InternVideo
-git checkout 09d872e5093296c6f36b8b3a91fc511b76433bf7
-
-# Download and apply NeMo Curator patch
-curl -fsSL https://raw.githubusercontent.com/NVIDIA/NeMo-Curator/main/external/intern_video2_multimodal.patch -o intern_video2_multimodal.patch
-patch -p1 < intern_video2_multimodal.patch
-cd ..
-
-# Add InternVideo2 to the environment
-uv add InternVideo/InternVideo2/multi_modality
-```
-
-:::
-
-:::{tab-item} Source Installation
-```bash
-# Inside the NeMo Curator folder
-bash external/intern_video2_installation.sh
-uv add InternVideo/InternVideo2/multi_modality
-```
-
-:::
-::::
-
 ---

 ## Package Extras
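
Because this purge touches docs, CI, the Dockerfile, and source files, a quick scan for leftover mentions can help confirm nothing was missed. The following is a minimal sketch, not part of the commit. The function name `find_references`, the default search term, and the default suffix list are all assumptions for illustration; files without a suffix (such as `Dockerfile`) are skipped unless the suffix list is extended.

```python
from pathlib import Path
from typing import List, Tuple, Union


def find_references(
    root: Union[str, Path],
    needle: str = "internvideo",
    suffixes: Tuple[str, ...] = (".py", ".md", ".mdc", ".yml", ".toml", ".sh"),
) -> List[Tuple[Path, int, str]]:
    """Return (path, line number, line) for each case-insensitive match under root."""
    hits: List[Tuple[Path, int, str]] = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle.lower() in line.lower():
                hits.append((path, lineno, line.strip()))
    return hits
```

An empty result for `find_references(".")` suggests, within the scanned file types, that no references remain.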
