Description
Checks
- I have updated to the latest minor and patch version of Strands
- I have checked the documentation and this is not expected behavior
- I have searched the existing issues and there are no duplicates of my issue
Strands Version
1.12.0
Python Version
3.13.5
Operating System
Ubuntu 22.04.4 LTS
Installation Method
pip
Steps to Reproduce
I have a vLLM server with a Qwen3-coder model and am hitting the OpenAI compatible endpoint.
Steps to reproduce:
- pip install strands-agents[openai]
- Configure the OpenAI model provider with the vLLM base_url and model (see below)
- Configure strands with the OpenAI model provider
- Invoke the agent
from strands import Agent, tool
from strands.models.openai import OpenAIModel

@tool
def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny with a high of 85F and a low of 58F"

model = OpenAIModel(
    client_args={
        "api_key": "EMPTY",
        "base_url": "<my base url>",
    },
    # **model_config
    model_id="<my Qwen3-coder model path>",
    params={
        "max_tokens": 1000,
        "temperature": 0.7,
    },
)

# Initialize the agent with tools, model, and configuration
agent = Agent(
    model=model,
    tools=[get_weather],
    system_prompt="You are the chief meteorologist at a TV station. When asked, you should look up the weather in the specified city and provide a colorful response",
)

result = agent("What is the weather in San Francisco?")
print(result.message)
I've also tried replacing the get_weather tool with the calculator tool from the strands-agents-tools package, and that doesn't invoke the tool either.
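For reference, the calculator variant I tried was roughly this (a minimal sketch; I'm assuming the documented strands_tools import path and reusing the same model object as above):

from strands import Agent
from strands_tools import calculator  # provided by the strands-agents-tools package

# Same OpenAIModel instance configured above, only the tool is swapped
agent = Agent(model=model, tools=[calculator])
result = agent("What is 12 * 34?")
print(result.message)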
Expected Behavior
"The weather in San Francisco is sunny with a high of 85F and a low of 58F" or something like that.
FWIW, the same tool, when used with the openai-agents SDK (with the relevant annotation) against the same vLLM server + model combination, generates the weather report as expected.
Actual Behavior
The result of running the code prints out:
<tool_call>
<function=get_weather>
<parameter=city>
San Francisco
</parameter>
</function>
</tool_call>
instead of the weather.
Here's the entire result:
AgentResult(stop_reason='end_turn',
message={'content': [{'text': '<tool_call>\n'
'<function=get_weather>\n'
'<parameter=city>\n'
'San Francisco\n'
'</parameter>\n'
'</function>\n'
'</tool_call>'}],
'role': 'assistant'},
metrics=EventLoopMetrics(cycle_count=1,
tool_metrics={},
cycle_durations=[0.3705911636352539],
traces=[<strands.telemetry.metrics.Trace object at 0x7fc473363460>],
accumulated_usage={'inputTokens': 310,
'outputTokens': 23,
'totalTokens': 333},
accumulated_metrics={'latencyMs': 0}),
state={})
Additional Context
I also issued a direct chat.completions request against the server using the OpenAI client. Here's that response:
ChatCompletion(id='chatcmpl-6f6f42cf28014456893aee0f6fa61b2e', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='chatcmpl-tool-15fa2e41b38a4a5aadece55724056f4a', function=Function(arguments='{"city": "San Francisco"}', name='get_weather'), type='function')], reasoning_content=None), stop_reason=None, token_ids=None)], created=1760142049, model='<my Qwen3-coder model name>', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=23, prompt_tokens=300, total_tokens=323, completion_tokens_details=None, prompt_tokens_details=None), prompt_logprobs=None, prompt_token_ids=None, kv_transfer_params=None)
The finish_reason is set to tool_calls as expected, and the tool call and arguments seem fine to me.
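The direct request was along these lines (a minimal sketch; the tool schema mirrors get_weather above and the placeholders stand in for my actual base URL and model path):

from openai import OpenAI

client = OpenAI(api_key="EMPTY", base_url="<my base url>")

response = client.chat.completions.create(
    model="<my Qwen3-coder model path>",
    messages=[{"role": "user", "content": "What is the weather in San Francisco?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "returns weather info for the specified city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
print(response)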
Possible Solution
No response
Related Issues
No response