</p>

# LLaVA-NeXT: Open Large Multimodal Models

[](http://arxiv.org/abs/2410.02713)
[](https://arxiv.org/abs/2408.03326)
[](https://llava-vl.github.io/blog/)

|
9 | 10 | [](https://llava-onevision.lmms-lab.com/) |
| 11 | +[](https://huggingface.co/spaces/WildVision/vision-arena) |
10 | 12 | [](https://huggingface.co/spaces/lmms-lab/LLaVA-NeXT-Interleave-Demo) |
[](https://openbayes.com/console/public/tutorials/gW0ng9jKXfO)

| 15 | +[](https://huggingface.co/collections/lmms-lab/llava-next-video-661e86f5e8dabc3ff793c944) |
14 | 16 | [](https://huggingface.co/collections/lmms-lab/llava-onevision-66a259c3526e15166d6bba37) |
15 | 17 | [](https://huggingface.co/collections/lmms-lab/llava-next-interleave-66763c55c411b340b35873d1) |
[](https://huggingface.co/lmms-lab)

## Release Notes

- **[2024/10/04] 🔥 LLaVA-Video** (formerly LLaVA-NeXT-Video) has undergone a major upgrade! We are excited to release **LLaVA-Video-178K**, a high-quality synthetic dataset for video instruction tuning. This dataset includes:

  - 178,510 caption entries
  - 960,792 open-ended Q&A pairs
  - 196,198 multiple-choice Q&A items

  Along with this, we are also releasing the **LLaVA-Video 7B/72B models**, which deliver competitive performance on the latest video benchmarks, including [Video-MME](https://video-mme.github.io/home_page.html#leaderboard), [LongVideoBench](https://longvideobench.github.io/), and [Dream-1K](https://tarsier-vlm.github.io/).

  📄 **Explore more**:
  - [LLaVA-Video-178K Dataset](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K): Download the dataset (a minimal download sketch follows below).
  - [LLaVA-Video Models](https://huggingface.co/collections/lmms-lab/llava-video-661e86f5e8dabc3ff793c944): Access model checkpoints.
  - [Paper](http://arxiv.org/abs/2410.02713): Detailed information about LLaVA-Video.
  - [LLaVA-Video Documentation](https://github.com/LLaVA-VL/LLaVA-NeXT/blob/main/docs/LLaVA_Video_1003.md): Guidance on training, inference, and evaluation.

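  As a quick starting point, here is a minimal sketch for fetching the dataset files locally. The repo id `lmms-lab/LLaVA-Video-178K` comes from the dataset link above; the use of `huggingface_hub.snapshot_download` and the path handling are illustrative assumptions, not the project's documented workflow — see the dataset card for subset layout.

  ```python
  # Hedged sketch: download the LLaVA-Video-178K dataset files from the Hugging Face Hub.
  # Requires `pip install huggingface_hub`; repo id taken from the dataset link above.
  from huggingface_hub import snapshot_download

  local_dir = snapshot_download(
      repo_id="lmms-lab/LLaVA-Video-178K",  # dataset repo linked in this release note
      repo_type="dataset",                  # it is a dataset repo, not a model repo
  )
  print(f"Dataset files downloaded to: {local_dir}")
  ```
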
- **[2024/09/13] 🔥 [LLaVA-OneVision-Chat](docs/LLaVA_OneVision_Chat.md)**. The new LLaVA-OV-Chat models (7B/72B) significantly improve the chat experience of LLaVA-OV.