
Conversation

suryabdev (Contributor)

This is a PR to fix #1794.

With vLLM > 0.10.1, `guided_options_request` was deprecated. This PR moves to the recommended `structured_outputs` approach:
https://github.com/vllm-project/vllm/blob/main/docs/features/structured_outputs.md#structured-outputs

> !!! warning
>     If you are still using the following deprecated API fields, please update your code to use `structured_outputs` as demonstrated in the rest of this document:
>
>     - `guided_json` -> `{"structured_outputs": {"json": ...}}` or `StructuredOutputsParams(json=...)`

Based on my understanding of the changelog (https://github.com/vllm-project/vllm/releases), they deprecated the parameter in 0.10.2 (last month) and completely removed support for the V0 APIs in 0.11 (last week).

I tested it with the following code:

```python
from smolagents import VLLMModel, CodeAgent

model = VLLMModel(model_id="HuggingFaceTB/SmolLM2-360M-Instruct")
agent = CodeAgent(model=model, tools=[])
agent.run("print the first 10 integers")
```

and I was able to reproduce the issue:

```
INFO 10-10 11:51:29 [llm.py:306] Supported_tasks: ['generate']
╭──────────────────────────────── New run ─────────────────────────────────╮
│                                                                           │
│ Hello                                                                     │
│                                                                           │
╰─ VLLMModel - HuggingFaceTB/SmolLM2-360M-Instruct ─────────────────────────╯
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Error in generating model output:
LLM.generate() got an unexpected keyword argument 'guided_options_request'
```

After the fix with vLLM 0.11:
[screenshot of the successful run]

```diff
     )
     # Override the OpenAI schema for VLLM compatibility
-    guided_options_request = {"guided_json": response_format["json_schema"]["schema"]} if response_format else None
+    structured_outputs = StructuredOutputsParams(json=response_format["json_schema"]["schema"]) if response_format else None
```
suryabdev (Contributor, Author):

@qjflores suggested a fix when he opened the issue:

```python
# Convert old guided_options_request format to new structured_outputs
structured_outputs_params = None
if response_format:
    if "json_schema" in response_format:
        # Extract the JSON schema from the response_format
        json_schema = response_format["json_schema"]["schema"]
        structured_outputs_params = StructuredOutputsParams(json=json_schema)
    elif "choice" in response_format:
        # Handle choice-based structured outputs
        structured_outputs_params = StructuredOutputsParams(choice=response_format["choice"])
    elif "regex" in response_format:
        # Handle regex-based structured outputs
        structured_outputs_params = StructuredOutputsParams(regex=response_format["regex"])
    elif "grammar" in response_format:
        # Handle grammar-based structured outputs
        structured_outputs_params = StructuredOutputsParams(grammar=response_format["grammar"])
    elif "structural_tag" in response_format:
        # Handle structural tag-based structured outputs
        structured_outputs_params = StructuredOutputsParams(structural_tag=response_format["structural_tag"])
    else:
        print(f"WARNING: Unsupported response_format type: {response_format}")
        structured_outputs_params = None
```

But if I understand correctly, JSON is the only structured-output param that is actually used:

additional_args["response_format"] = CODEAGENT_RESPONSE_FORMAT

CODEAGENT_RESPONSE_FORMAT description
"json_schema": {

So I simplified his solution and incorporated it into the PR.
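
To make the simplification concrete, here is a rough sketch of how the pieces fit together. The contents of `CODEAGENT_RESPONSE_FORMAT` and the surrounding `generate()` code are paraphrased, so treat the name `"code_agent_answer"` and the schema body as illustrative placeholders rather than copies from the PR:

```python
from vllm import SamplingParams
from vllm.sampling_params import StructuredOutputsParams

# Illustrative stand-in for CODEAGENT_RESPONSE_FORMAT: what matters is that
# only the "json_schema" -> "schema" path is ever exercised.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "code_agent_answer",  # hypothetical name
        "schema": {"type": "object", "properties": {"code": {"type": "string"}}},
    },
}

# The simplified one-liner from the diff above, and how it gets consumed:
structured_outputs = (
    StructuredOutputsParams(json=response_format["json_schema"]["schema"])
    if response_format
    else None
)
sampling_params = SamplingParams(structured_outputs=structured_outputs)
```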


makes sense to me

```diff
 ]
 vllm = [
-    "vllm",
+    "vllm>=0.10.2",
```
suryabdev (Contributor, Author):

We could add some version-check logic like the following:

```python
from packaging.version import parse
import vllm

if response_format:
    json_schema = response_format["json_schema"]["schema"]
    if parse(vllm.__version__) >= parse("0.10.2"):
        from vllm.sampling_params import StructuredOutputsParams

        structured_outputs = StructuredOutputsParams(json=json_schema)
    else:
        guided_options_request = {"guided_json": json_schema}
```

But I think that adds unnecessary complexity. It might be simpler to just require vLLM >= 0.10.2.

@suryabdev (Contributor, Author), Oct 10, 2025:

Tested with vLLM 0.10.0: with the new code it breaks.
[screenshot of the failure with vLLM 0.10.0]

```python
    ToolCallingAgent,
    stream_to_gradio,
)
from smolagents.memory import ActionStep, AgentMemory
```
@suryabdev (Contributor, Author), Oct 11, 2025:

Edit: These style changes were already merged to main; they come from `make style` in a different commit.

@suryabdev (Contributor, Author):
cc: @albertvillanova / @aymeric-roucher for review

@qjflores:
thank you!
