1 parent 5bd2584 commit 4cbc75c
InterVL/InterVL3.md
@@ -17,7 +17,7 @@ uv pip install -U vllm --torch-backend auto
### Weights

[OpenGVLab/InternVL3-8B-hf](https://huggingface.co/OpenGVLab/InternVL3-8B)

-#### Running InternVL3-8B-hf model on A100-SXM4-40GB GPUs (2 cards) in eager mode
+### Running InternVL3-8B-hf model on A100-SXM4-40GB GPUs (2 cards) in eager mode

Launch the online inference server using TP=2:
```bash
@@ -96,7 +96,7 @@ vllm bench serve \
--random-input 2048 \
--random-output 1024 \
--max-concurrency 10 \
- --num-prompts 50\
+ --num-prompts 50 \
--ignore-eos
```
If it works successfully, you will see the following output.
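For context, the benchmark hunk above targets an already-running server, and the first hunk's "Launch the online inference server using TP=2" command is cut off by the diff. A minimal sketch of that launch, assuming the standard `vllm serve` CLI flags (`--tensor-parallel-size` for TP=2, `--enforce-eager` for eager mode) rather than the exact command from the original file, might look like:

```shell
# Hedged sketch, not the verbatim command from InterVL3.md:
# serve InternVL3-8B-hf across 2 GPUs with tensor parallelism,
# disabling CUDA graph capture (eager mode) via --enforce-eager.
vllm serve OpenGVLab/InternVL3-8B-hf \
  --tensor-parallel-size 2 \
  --enforce-eager
```

Once the server is up (it listens on port 8000 by default), the `vllm bench serve` invocation shown in the second hunk can be pointed at it; the trailing-backslash fix in that hunk matters because without the space before `\`, the shell would join `50\` with the next line and break argument parsing.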