1414- `2024/01/24`: InternVL-Chat-V1.1 is released. It supports Chinese and has stronger OCR capability; see [here](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1) or try our [demo](https://internvl.opengvlab.com/).
1515- `2024/01/16`: We release our [customized mmcv/mmsegmentation/mmdetection code](https://github.com/OpenGVLab/InternVL-MMDetSeg), integrated with DeepSpeed, which can be used for training large-scale object detection and semantic segmentation models.
1616
17-
1817## Compared with SOTA VLLMs
1918
2019<img width="1229" alt="image" src="https://github.com/OpenGVLab/InternVL/assets/23737120/e9065a58-86fa-47ef-be9a-eb734532e73f">
@@ -29,26 +28,25 @@ InternVL scales up the ViT to _**6B parameters**_ and aligns it with LLM.
2928
3029**Vision Large Language Model**
3130
32- | Model | Date | Download | Note |
33- | ----------------------- | ---------- | ------------------------------------------------------------------------------------ | ---------------------------------- |
34- | InternVL−Chat−V1.5 | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5) | supports 4K images; very strong OCR; approaches the performance of GPT-4V and Gemini Pro on benchmarks such as MMMU, DocVQA, ChartQA, and MathVista (🔥new) |
35- | InternVL−Chat−V1.2−Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus) | more SFT data and stronger performance |
36- | InternVL−Chat−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2) | scales the LLM up to 34B |
37- | InternVL−Chat−V1.1 | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1) | supports Chinese; stronger OCR |
38- | InternVL−Chat−19B−448px | 2024.02.03 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B-448px) | 448px resolution |
39- | InternVL−Chat−19B | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B) | English multimodal dialogue |
40- | InternVL−Chat−13B | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B) | English multimodal dialogue |
41-
31+ | Model | Date | Download | Note |
32+ | ----------------------- | ---------- | ------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
33+ | InternVL−Chat−V1.5 | 2024.04.18 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5) | supports 4K images; very strong OCR; approaches the performance of GPT-4V and Gemini Pro on benchmarks such as MMMU, DocVQA, ChartQA, and MathVista (🔥new) |
34+ | InternVL−Chat−V1.2−Plus | 2024.02.21 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2-Plus) | more SFT data and stronger performance |
35+ | InternVL−Chat−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-2) | scales the LLM up to 34B |
36+ | InternVL−Chat−V1.1 | 2024.01.24 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-V1-1) | supports Chinese; stronger OCR |
37+ | InternVL−Chat−19B−448px | 2024.02.03 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B-448px) | 448px resolution |
38+ | InternVL−Chat−19B | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B) | English multimodal dialogue |
39+ | InternVL−Chat−13B | 2023.12.25 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B) | English multimodal dialogue |
4240
4341**Vision-Language Foundation Model**
4442
45- | Model | Date | Download | Note |
46- | ----------------------- | ---------- | ---------------------------------------------------------------------- | -------------------------------- |
43+ | Model | Date | Download | Note |
44+ | ----------------------- | ---------- | ---------------------------------------------------------------------- | ---------------------------------------------------- |
4745| InternViT−6B−448px−V1.5 | 2024.04.20 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-5) | supports dynamic resolution; very strong OCR (🔥new) |
48- | InternViT−6B−448px−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2) | 448px resolution |
49- | InternViT−6B−448px−V1.0 | 2024.01.30 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-0) | 448px resolution |
50- | InternViT−6B−224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-224px) | vision foundation model |
51- | InternVL−14B−224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-14B-224px) | vision-language foundation model |
46+ | InternViT−6B−448px−V1.2 | 2024.02.11 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2) | 448px resolution |
47+ | InternViT−6B−448px−V1.0 | 2024.01.30 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-0) | 448px resolution |
48+ | InternViT−6B−224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternViT-6B-224px) | vision foundation model |
49+ | InternVL−14B−224px | 2023.12.22 | 🤗 [HF link](https://huggingface.co/OpenGVLab/InternVL-14B-224px) | vision-language foundation model |
5250
5351## What can InternVL do?
5452
@@ -578,47 +576,46 @@ response = model.chat(tokenizer, pixel_values, question, generation_config)
578576 <summary>Launch a local chat demo (click to expand)</summary>
579577
580578**Launch a controller**
581-
582- ``` shell
583- # run the command in the `internvl_chat_llava` folder
584- python -m llava.serve.controller --host 0.0.0.0 --port 10000
585- ```
579+
580+ ``` shell
581+ # run the command in the `internvl_chat_llava` folder
582+ python -m llava.serve.controller --host 0.0.0.0 --port 10000
583+ ```
586584
587585**Launch a gradio web server**
588-
589- ``` shell
590- # run the command in the `internvl_chat_llava` folder
591- python -m llava.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload
592- ```
593586
594- **Launch a model worker**
595-
596- ``` shell
597- # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
598- # run the command in the `internvl_chat_llava` folder
599- python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
600-
601- # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
602- # run the command in the `internvl_chat_llava` folder
603- python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40001 --worker http://localhost:40001 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
604-
605- # OpenGVLab/InternVL-Chat-V1-1
606- # run the command in the `internvl_chat` folder
607- python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40002 --worker http://localhost:40002 --model-path OpenGVLab/InternVL-Chat-V1-1
608-
609- # OpenGVLab/InternVL-Chat-V1-2
610- # run the command in the `internvl_chat` folder
611- python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40003 --worker http://localhost:40003 --model-path OpenGVLab/InternVL-Chat-V1-2
612-
613- # OpenGVLab/InternVL-Chat-V1-2-Plus
614- # run the command in the `internvl_chat` folder
615- python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40004 --worker http://localhost:40004 --model-path OpenGVLab/InternVL-Chat-V1-2-Plus
616-
617- # OpenGVLab/InternVL-Chat-V1-5
618- # run the command in the `internvl_chat` folder
619- python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40005 --worker http://localhost:40005 --model-path OpenGVLab/InternVL-Chat-V1-5
587+ ``` shell
588+ # run the command in the `internvl_chat_llava` folder
589+ python -m llava.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload
620590```
621591
592+ **Launch a model worker**
593+
594+ ``` shell
595+ # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
596+ # run the command in the `internvl_chat_llava` folder
597+ python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
598+
599+ # OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
600+ # run the command in the `internvl_chat_llava` folder
601+ python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40001 --worker http://localhost:40001 --model-path OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B
602+
603+ # OpenGVLab/InternVL-Chat-V1-1
604+ # run the command in the `internvl_chat` folder
605+ python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40002 --worker http://localhost:40002 --model-path OpenGVLab/InternVL-Chat-V1-1
606+
607+ # OpenGVLab/InternVL-Chat-V1-2
608+ # run the command in the `internvl_chat` folder
609+ python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40003 --worker http://localhost:40003 --model-path OpenGVLab/InternVL-Chat-V1-2
610+
611+ # OpenGVLab/InternVL-Chat-V1-2-Plus
612+ # run the command in the `internvl_chat` folder
613+ python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40004 --worker http://localhost:40004 --model-path OpenGVLab/InternVL-Chat-V1-2-Plus
614+
615+ # OpenGVLab/InternVL-Chat-V1-5
616+ # run the command in the `internvl_chat` folder
617+ python -m internvl.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40005 --worker http://localhost:40005 --model-path OpenGVLab/InternVL-Chat-V1-5
618+ ```
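The worker commands above differ only in the serving module, the port, and the checkpoint. As a minimal sketch, a hypothetical `worker_cmd` helper (not part of the InternVL repository) could print the command for a given combination so it can be reviewed before it is run. The controller address and flags are copied from the commands above; the helper name and the dry-run approach are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the InternVL repo: it only prints a
# model-worker command; nothing is launched.
CONTROLLER="http://localhost:10000"

worker_cmd() {
  # $1: serving module, $2: worker port, $3: checkpoint on the Hugging Face Hub
  local module="$1" port="$2" model="$3"
  printf 'python -m %s --host 0.0.0.0 --controller %s --port %s --worker http://localhost:%s --model-path %s\n' \
    "$module" "$CONTROLLER" "$port" "$port" "$model"
}

# llava.serve workers are meant to be run from the `internvl_chat_llava` folder.
worker_cmd llava.serve.model_worker 40000 OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B

# internvl.serve workers are meant to be run from the `internvl_chat` folder.
worker_cmd internvl.serve.model_worker 40005 OpenGVLab/InternVL-Chat-V1-5
```

Each worker gets its own port, and the `--worker` URL repeats that port so the controller can reach it; printing the command first makes a mismatch between the two easy to spot.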
622619
623620</details>
624621