
vLLM 0.14.0: _preprocess_chat Return Value Mismatch Causes ValueError #1641

@kelvin715

Description


Problem Description

When using vLLM 0.14.0, the OpenAIServingChat._preprocess_chat method returns fewer values than the calling code expects, causing ValueError: not enough values to unpack (expected 3, got 2).

Error Message

ValueError: not enough values to unpack (expected 3, got 2)

Error Location:

  • File: src/prime_rl/inference/vllm/serving_chat_with_tokens.py
  • Line: 100 (before fix)
  • Method: OpenAIServingChatWithTokens.create_chat_completion_with_tokens

Problem Details

Expected Return Value Format

The code expects _preprocess_chat to return 3 values:

conversation, request_prompts, engine_prompts = await self._preprocess_chat(...)

Actual Return Value Format

In vLLM 0.14.0, _preprocess_chat actually returns only 2 values:

# Actual return type: tuple[list[ConversationMessage], list[TokensPrompt]]
conversation, engine_prompts = await self._preprocess_chat(...)
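For reference, the returned engine prompts are TokensPrompt objects, which are TypedDicts and can be handled as plain dicts. A minimal sketch (only the prompt_token_ids field is shown, since that is the one the workaround below relies on; other fields may vary by version):

from vllm.inputs.data import TokensPrompt

# Minimal TokensPrompt instance; fields beyond prompt_token_ids may vary by vLLM version.
engine_prompt: TokensPrompt = {"prompt_token_ids": [1, 2, 3]}
print(engine_prompt.get("prompt_token_ids"))  # [1, 2, 3]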

Verification

Verified by inspecting vLLM 0.14.0 source code and runtime behavior:

from vllm.entrypoints.openai.serving_chat import OpenAIServingChat
import inspect

sig = inspect.signature(OpenAIServingChat._preprocess_chat)
print('Return annotation:', sig.return_annotation)
# Output: tuple[list[vllm.entrypoints.chat_utils.ConversationMessage], list[vllm.inputs.data.TokensPrompt]]

# Actual return statement (from serving_engine.py:855)
# return conversation, [engine_prompt]
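A quick programmatic arity check is also possible. This is a sketch; it assumes the return annotation is a resolved parameterized tuple (as printed above) rather than a postponed string annotation:

import inspect
import typing

from vllm.entrypoints.openai.serving_chat import OpenAIServingChat

# Count the elements of the annotated return tuple; on vLLM 0.14.0 this prints 2.
annotation = inspect.signature(OpenAIServingChat._preprocess_chat).return_annotation
print("Annotated return values:", len(typing.get_args(annotation)))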

Environment Information

  • vLLM Version: 0.14.0
  • Python Version: 3.12.0
  • Operating System: Linux
  • Related Code: OpenAIServingChatWithTokens class in prime-rl project

Reproduction Steps

  1. Start the inference server with vLLM 0.14.0
  2. Call the /v1/chat/completions/tokens endpoint
  3. This triggers the create_chat_completion_with_tokens method
  4. The unpacking error occurs at the _preprocess_chat call (see the request sketch below)
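A minimal request for step 2 might look like the following. This is a sketch only: the request schema of the custom /v1/chat/completions/tokens endpoint is defined by prime-rl and is assumed here to mirror the standard OpenAI chat payload; the host, port, and model name are placeholders.

import requests

# Hypothetical request; endpoint schema, host, and model name are assumptions.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions/tokens",
    json={
        "model": "my-model",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,
)
print(resp.status_code, resp.text)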

Workaround

Temporary fix implemented in serving_chat_with_tokens.py:

# vLLM 0.14.0's _preprocess_chat returns (conversation, [engine_prompt])
# instead of (conversation, request_prompts, engine_prompts)
conversation, engine_prompts = await self._preprocess_chat(...)

# Construct request_prompts from engine_prompts for compatibility
request_prompts = []
for engine_prompt in engine_prompts:
    request_prompts.append({
        "prompt_token_ids": engine_prompt.get("prompt_token_ids", []),
    })
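A more defensive variant, not used in the current fix, would branch on the number of returned values so the same call site works across vLLM versions. This is a sketch that assumes older versions return the 3-tuple shown earlier:

# Sketch of version-agnostic unpacking (assumption: older vLLM versions return 3 values).
result = await self._preprocess_chat(...)
if len(result) == 3:
    conversation, request_prompts, engine_prompts = result
else:
    conversation, engine_prompts = result
    request_prompts = [
        {"prompt_token_ids": ep.get("prompt_token_ids", [])} for ep in engine_prompts
    ]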

Suggestions

  1. Documentation Update: vLLM documentation should clearly state the return value format of _preprocess_chat across versions
  2. API Stability: vLLM should maintain backward compatibility for this method, or provide a migration guide when the return format changes
  3. Type Annotations: Ensure type annotations match the actual return values

Related Files

  • vLLM Source: vllm/entrypoints/openai/serving_engine.py (lines 740-856)
  • Affected Code: src/prime_rl/inference/vllm/serving_chat_with_tokens.py (lines 100-124)

Additional Information

This issue affects functionality using the custom /v1/chat/completions/tokens endpoint, which supports pre-tokenized prompt input. This is important for certain RL training scenarios.
