Conversation

@shen-shanshan (Collaborator)

What this PR does / why we need it?

Remove legacy input mapper/processor from V0.

Find more details at #673 and vllm-project/vllm#15686.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Launch the online service:

```bash
vllm serve Qwen/Qwen2.5-VL-7B-Instruct \
--dtype bfloat16 \
--max_model_len 32768 \
--max-num-batched-tokens 32768
```
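
The server takes some time to load the model; one way to wait for it is to poll vLLM's /health endpoint before sending requests (a minimal sketch, assuming the default host and port used by the serve command above):

```python
# Poll vLLM's /health endpoint until the server reports ready.
# Sketch only: assumes the default http://localhost:8000 from the serve command above.
import time
import urllib.request

for _ in range(120):
    try:
        with urllib.request.urlopen("http://localhost:8000/health") as resp:
            if resp.status == 200:
                print("server is ready")
                break
    except OSError:
        # Connection refused while the model is still loading; retry.
        time.sleep(5)
else:
    raise RuntimeError("server did not become ready in time")
```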

Query the server:

```bash
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
    "model": "Qwen/Qwen2.5-VL-7B-Instruct",
    "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"}},
        {"type": "text", "text": "What is the text in the illustrate?"}
    ]}
    ]
    }'
```
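
The same request can also be sent with the OpenAI Python client (a minimal sketch; the api_key value is a placeholder, since this local server runs without authentication):

```python
# Equivalent of the curl request above, using the OpenAI Python client.
from openai import OpenAI

# "EMPTY" is a placeholder key; the local vLLM server does not check it.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": [
            {"type": "image_url", "image_url": {"url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"}},
            {"type": "text", "text": "What is the text in the illustrate?"},
        ]},
    ],
)
print(response.choices[0].message.content)
```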

Result of the curl request:

```json
{"id":"chatcmpl-619e70733ed148b3be3a0b6524ee0ef3","object":"chat.completion","created":1748226332,"model":"/home/sss/.cache/modelscope/hub/models/Qwen/Qwen2___5-VL-7B-Instruct","choices":[{"index":0,"message":{"role":"assistant","reasoning_content":null,"content":"The text in the illustration reads \"TONGYI Qwen.\"","tool_calls":[]},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"pro
```

@wangxiyuan added the ready (ready for review) label on May 28, 2025
@shen-shanshan force-pushed the mm branch 2 times, most recently from ec5ae94 to b431db1 on May 29, 2025 03:05
@shen-shanshan (Collaborator, Author)

I have rebased onto the latest main and nothing has changed.

@shen-shanshan (Collaborator, Author) commented Jun 3, 2025

@wangxiyuan The CI for this PR has passed.

@wangxiyuan merged commit 9386057 into vllm-project:main on Jun 3, 2025; 20 checks passed.
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request on Jun 3, 2025: Remove legacy input mapper/processor from V0 (vllm-project#951)
David9857 pushed a commit to David9857/vllm-ascend that referenced this pull request on Jun 3, 2025: Remove legacy input mapper/processor from V0 (vllm-project#951)
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request on Jun 4, 2025: Remove legacy input mapper/processor from V0 (vllm-project#951)