|
<!-- [中文 README](README_cn.md) -->

|
- [Action Classification on Kinetics-400](https://paperswithcode.com/sota/action-classification-on-kinetics-400?p=internvideo2-scaling-video-foundation-models)
- [Action Classification on Kinetics-600](https://paperswithcode.com/sota/action-classification-on-kinetics-600?p=internvideo2-scaling-video-foundation-models)
- [Action Classification on Kinetics-700](https://paperswithcode.com/sota/action-classification-on-kinetics-700?p=internvideo2-scaling-video-foundation-models)
- [Action Recognition in Videos on Something-Something](https://paperswithcode.com/sota/action-recognition-in-videos-on-something?p=internvideo2-scaling-video-foundation-models)
- [Action Recognition in Videos on ActivityNet](https://paperswithcode.com/sota/action-recognition-in-videos-on-activitynet?p=internvideo2-scaling-video-foundation-models)
- [Action Classification on Moments in Time](https://paperswithcode.com/sota/action-classification-on-moments-in-time?p=internvideo2-scaling-video-foundation-models)
- [Action Recognition on HACS](https://paperswithcode.com/sota/action-recognition-on-hacs?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Video Retrieval on MSR-VTT](https://paperswithcode.com/sota/zero-shot-video-retrieval-on-msr-vtt?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Video Retrieval on MSVD](https://paperswithcode.com/sota/zero-shot-video-retrieval-on-msvd?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Video Retrieval on LSMDC](https://paperswithcode.com/sota/zero-shot-video-retrieval-on-lsmdc?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Video Retrieval on DiDeMo](https://paperswithcode.com/sota/zero-shot-video-retrieval-on-didemo?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Video Retrieval on VATEX](https://paperswithcode.com/sota/zero-shot-video-retrieval-on-vatex?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Video Retrieval on ActivityNet](https://paperswithcode.com/sota/zero-shot-video-retrieval-on-activitynet?p=internvideo2-scaling-video-foundation-models)
- [Video Retrieval on MSR-VTT](https://paperswithcode.com/sota/video-retrieval-on-msr-vtt?p=internvideo2-scaling-video-foundation-models)
- [Video Retrieval on DiDeMo](https://paperswithcode.com/sota/video-retrieval-on-didemo?p=internvideo2-scaling-video-foundation-models)
- [Video Retrieval on MSVD](https://paperswithcode.com/sota/video-retrieval-on-msvd?p=internvideo2-scaling-video-foundation-models)
- [Video Retrieval on LSMDC](https://paperswithcode.com/sota/video-retrieval-on-lsmdc?p=internvideo2-scaling-video-foundation-models)
- [Video Retrieval on ActivityNet](https://paperswithcode.com/sota/video-retrieval-on-activitynet?p=internvideo2-scaling-video-foundation-models)
- [Video Retrieval on VATEX](https://paperswithcode.com/sota/video-retrieval-on-vatex?p=internvideo2-scaling-video-foundation-models)
- [Text-to-Audio Retrieval on AudioCaps](https://paperswithcode.com/sota/text-to-audio-retrieval-on-audiocaps?p=internvideo2-scaling-video-foundation-models)
- [Text-to-Audio Retrieval on Clotho](https://paperswithcode.com/sota/text-to-audio-retrieval-on-clotho?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Text-to-Audio Retrieval](https://paperswithcode.com/sota/zero-shot-text-to-audio-retrieval-on?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Text-to-Audio Retrieval on Clotho](https://paperswithcode.com/sota/zero-shot-text-to-audio-retrieval-on-clotho?p=internvideo2-scaling-video-foundation-models)
- [Audio Classification on ESC-50](https://paperswithcode.com/sota/audio-classification-on-esc-50?p=internvideo2-scaling-video-foundation-models)
- [Video Grounding on QVHighlights](https://paperswithcode.com/sota/video-grounding-on-qvhighlights?p=internvideo2-scaling-video-foundation-models)
- [Temporal Action Localization on FineAction](https://paperswithcode.com/sota/temporal-action-localization-on-fineaction?p=internvideo2-scaling-video-foundation-models)
- [Temporal Action Localization on HACS](https://paperswithcode.com/sota/temporal-action-localization-on-hacs?p=internvideo2-scaling-video-foundation-models)
- [Temporal Action Localization on THUMOS14](https://paperswithcode.com/sota/temporal-action-localization-on-thumos14?p=internvideo2-scaling-video-foundation-models)
- [Temporal Action Localization on ActivityNet](https://paperswithcode.com/sota/temporal-action-localization-on-activitynet?p=internvideo2-scaling-video-foundation-models)
- [Zero-Shot Video Question Answering on EgoSchema](https://paperswithcode.com/sota/zero-shot-video-question-answer-on-egoschema-1?p=internvideo2-scaling-video-foundation-models)
- [Video Instance Segmentation on YouTube-VIS](https://paperswithcode.com/sota/video-instance-segmentation-on-youtube-vis-1?p=internvideo2-scaling-video-foundation-models)
|
This repo will soon provide the code and models of '[InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding](https://arxiv.org/abs/2403.15377)'.
|