qwen3-coder tool call parser #15019
Conversation
Note the excessive 0s in the test 6 response.
╰─ ./test_tool_calls.sh
Test 1: Simple single parameter function
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"get_current_time","arguments":"{\"timezone\":\"UTC\"}"},"id":"hjjtHH2uyKY2JE1iOAMpO8geQI3a7Kq8"}]}}],"created":1754076607,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":23,"prompt_tokens":291,"total_tokens":314},"id":"chatcmpl-Oh1CUFa7Uje5DBZdlPn2ymvUHHV1VfGO","timings":{"prompt_n":291,"prompt_ms":384.05,"prompt_per_token_ms":1.3197594501718213,"prompt_per_second":757.7138393438355,"predicted_n":23,"predicted_ms":386.682,"predicted_per_token_ms":16.81226086956522,"predicted_per_second":59.48039991517578}}
---
Test 2: Multiple parameters with different types
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"search_web","arguments":"{\"query\":\"Python machine learning tutorials\",\"max_results\":10,\"safe_search\":true}"},"id":"KaUBUGLOC8tJ7rRejdwoGxgnTIgX7Pcx"}]}}],"created":1754076608,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":44,"prompt_tokens":352,"total_tokens":396},"id":"chatcmpl-tbE2CyT6u51I5w5hvmTkz5IAWIEqfopF","timings":{"prompt_n":313,"prompt_ms":380.837,"prompt_per_token_ms":1.2167316293929713,"prompt_per_second":821.8739250650541,"predicted_n":44,"predicted_ms":785.807,"predicted_per_token_ms":17.85925,"predicted_per_second":55.993392779652}}
---
Test 3: Multiple tools available
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"calculate","arguments":"{\"expression\":25}"},"id":"gyHKa9Nu1RXaemgir1NbNwk2kJkhUN93"}]}}],"created":1754076609,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":27,"prompt_tokens":444,"total_tokens":471},"id":"chatcmpl-vDHl8fm1kgOywcEFRWNN4ib6lQu3WwWu","timings":{"prompt_n":405,"prompt_ms":447.868,"prompt_per_token_ms":1.105846913580247,"prompt_per_second":904.2842980521045,"predicted_n":27,"predicted_ms":421.646,"predicted_per_token_ms":15.61651851851852,"predicted_per_second":64.0347590158569}}
---
Test 4: Complex parameters with nested objects
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"send_email","arguments":"{\"to\":\"[email protected]\",\"subject\":\"Meeting Tomorrow\",\"body\":\"Hi John, just confirming our meeting scheduled for tomorrow. Best regards!\",\"attachments\":[]}"},"id":"0cbmSr1KCrjnI2i05u9d2kFtDL6mxy2i"}]}}],"created":1754076611,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":67,"prompt_tokens":395,"total_tokens":462},"id":"chatcmpl-z1IWL8zTqJ0NYAUgT0QCyClvpDGYFSW1","timings":{"prompt_n":356,"prompt_ms":391.67,"prompt_per_token_ms":1.1001966292134833,"prompt_per_second":908.9284346516199,"predicted_n":67,"predicted_ms":1065.706,"predicted_per_token_ms":15.906059701492536,"predicted_per_second":62.86912150255324}}
---
Test 5: No tools needed scenario
{"choices":[{"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"The capital of France is Paris."}}],"created":1754076611,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":8,"prompt_tokens":290,"total_tokens":298},"id":"chatcmpl-uCvewROkNZys70V5XinqN3IX1au86OqK","timings":{"prompt_n":251,"prompt_ms":317.8,"prompt_per_token_ms":1.2661354581673308,"prompt_per_second":789.80490874764,"predicted_n":8,"predicted_ms":106.817,"predicted_per_token_ms":13.352125,"predicted_per_second":74.89444564067517}}
---
Test 6: Tool with enum parameter
{"choices":[{"finish_reason":"length","index":0,"message":{"role":"assistant","content":"<tool_call>\n<function=set_temperature>\n<parameter=temperature>\n72.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"}}],"created":1754076945,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":16384,"prompt_tokens":339,"total_tokens":16723},"id":"chatcmpl-d1tMubhsORcuT4orGUXXflfsnYMPo5Qd","timings":{"prompt_n":300,"prompt_ms":375.494,"prompt_per_token_ms":1.2516466666666668,"prompt_per_second":798.9475198005827,"predicted_n":16384,"predicted_ms":333558.829,"predicted_per_token_ms":20.358815246582033,"predicted_per_second":49.11877178942848}}
---
Test 7: Tool call with reasoning
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"get_weather_forecast","arguments":"{\"location\":\"Seattle\",\"date\":2023}"},"id":"99zJfjF9AbRY0E4xuxIkwEo41HP1V9JT"}]}}],"created":1754076946,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":42,"prompt_tokens":341,"total_tokens":383},"id":"chatcmpl-rYYzcsvPxy662qJ1YUOW0Nwcp7J6ChKd","timings":{"prompt_n":302,"prompt_ms":413.176,"prompt_per_token_ms":1.3681324503311258,"prompt_per_second":730.9233837396171,"predicted_n":42,"predicted_ms":664.164,"predicted_per_token_ms":15.81342857142857,"predicted_per_second":63.2373931739751}}% |
#!/bin/bash
# Test 1: Simple single parameter function
echo "Test 1: Simple single parameter function"
curl -X POST http://127.0.0.1:9999/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder-30b-a3b-instruct",
"messages": [
{"role": "user", "content": "What is the current time?"}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Get the current time",
"parameters": {
"type": "object",
"properties": {
"timezone": {
"type": "string",
"description": "The timezone to get the time for"
}
},
"required": ["timezone"]
}
}
}
]
}'
echo -e "\n\n---\n"
# Test 2: Multiple parameters with different types
echo "Test 2: Multiple parameters with different types"
curl -X POST http://127.0.0.1:9999/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder-30b-a3b-instruct",
"messages": [
{"role": "user", "content": "Search for Python tutorials about machine learning"}
],
"tools": [
{
"type": "function",
"function": {
"name": "search_web",
"description": "Search the web for information",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query"
},
"max_results": {
"type": "integer",
"description": "Maximum number of results to return"
},
"safe_search": {
"type": "boolean",
"description": "Enable safe search filtering"
}
},
"required": ["query"]
}
}
}
]
}'
echo -e "\n\n---\n"
# Test 3: Multiple tools available
echo "Test 3: Multiple tools available"
curl -X POST http://127.0.0.1:9999/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder-30b-a3b-instruct",
"messages": [
{"role": "user", "content": "Calculate 25 * 4 and then convert the result to hexadecimal"}
],
"tools": [
{
"type": "function",
"function": {
"name": "calculate",
"description": "Perform mathematical calculations",
"parameters": {
"type": "object",
"properties": {
"expression": {
"type": "string",
"description": "The mathematical expression to evaluate"
}
},
"required": ["expression"]
}
}
},
{
"type": "function",
"function": {
"name": "convert_number",
"description": "Convert a number between different bases",
"parameters": {
"type": "object",
"properties": {
"number": {
"type": "integer",
"description": "The number to convert"
},
"from_base": {
"type": "integer",
"description": "The base to convert from (2-36)"
},
"to_base": {
"type": "integer",
"description": "The base to convert to (2-36)"
}
},
"required": ["number", "from_base", "to_base"]
}
}
}
]
}'
echo -e "\n\n---\n"
# Test 4: Complex parameters with nested objects
echo "Test 4: Complex parameters with nested objects"
curl -X POST http://127.0.0.1:9999/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder-30b-a3b-instruct",
"messages": [
{"role": "user", "content": "Send an email to [email protected] about the meeting tomorrow"}
],
"tools": [
{
"type": "function",
"function": {
"name": "send_email",
"description": "Send an email message",
"parameters": {
"type": "object",
"properties": {
"to": {
"type": "string",
"description": "Recipient email address"
},
"subject": {
"type": "string",
"description": "Email subject"
},
"body": {
"type": "string",
"description": "Email body content"
},
"attachments": {
"type": "array",
"description": "List of file paths to attach",
"items": {
"type": "string"
}
}
},
"required": ["to", "subject", "body"]
}
}
}
]
}'
echo -e "\n\n---\n"
# Test 5: No tools needed scenario
echo "Test 5: No tools needed scenario"
curl -X POST http://127.0.0.1:9999/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder-30b-a3b-instruct",
"messages": [
{"role": "user", "content": "What is the capital of France?"}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather information for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and country"
}
},
"required": ["location"]
}
}
}
]
}'
echo -e "\n\n---\n"
# Test 6: Tool with enum parameter
echo "Test 6: Tool with enum parameter"
curl -X POST http://127.0.0.1:9999/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder-30b-a3b-instruct",
"messages": [
{"role": "user", "content": "Set the temperature to 72 degrees"}
],
"tools": [
{
"type": "function",
"function": {
"name": "set_temperature",
"description": "Set the temperature of a device",
"parameters": {
"type": "object",
"properties": {
"temperature": {
"type": "number",
"description": "The temperature value"
},
"unit": {
"type": "string",
"description": "The temperature unit",
"enum": ["celsius", "fahrenheit", "kelvin"]
}
},
"required": ["temperature", "unit"]
}
}
}
]
}'
echo -e "\n\n---\n"
# Test 7: Tool call with reasoning
echo "Test 7: Tool call with reasoning"
curl -X POST http://127.0.0.1:9999/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder-30b-a3b-instruct",
"messages": [
{"role": "user", "content": "I need to know if it will rain tomorrow in Seattle. Can you check the weather forecast?"}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather_forecast",
"description": "Get weather forecast for a specific location and date",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city name"
},
"date": {
"type": "string",
"description": "The date in YYYY-MM-DD format"
}
},
"required": ["location", "date"]
}
}
}
]
}'
Okay, it does work with:
./llama-server --port 9999 --flash-attn --metrics --model Qwen3-Coder-30B-A3B-Instruct-GGUF/Qwen3-Coder-30B-A3B-Instruct-Q8_0.gguf \
--temp 0.7 \
--top-k 20 \
--top-p 0.7 \
--repeat-penalty 1.5 \
--presence_penalty 0.2 \
--n-predict 16384 \
--ctx-size 100000 \
--chat-template-file ../../models/templates/Qwen3-Coder.jinja \
--jinja
╰─ ./test_tool_calls.sh
Test 1: Simple single parameter function
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"get_current_time","arguments":"{\"timezone\":\"UTC\"}"},"id":"75b5wIBMpPzb6GuJiIVL8WmSjTxv4dPw"}]}}],"created":1754078261,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":23,"prompt_tokens":336,"total_tokens":359},"id":"chatcmpl-rYut7ARH9UuojO2rgC9abRHmPXl0b1cY","timings":{"prompt_n":336,"prompt_ms":415.761,"prompt_per_token_ms":1.2373839285714285,"prompt_per_second":808.1566092057697,"predicted_n":23,"predicted_ms":360.854,"predicted_per_token_ms":15.689304347826086,"predicted_per_second":63.737688926823594}}
---
Test 2: Multiple parameters with different types
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"search_web","arguments":"{\"query\":\"Python Machine Learning tutorial\",\"max_results\":10,\"safe_search\":true}"},"id":"AI6eUIRsqwsTf4lmPVzM5pplsPhjpqfr"}]}}],"created":1754078262,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":44,"prompt_tokens":397,"total_tokens":441},"id":"chatcmpl-QGyPqENZCvVG16BM0nMu9uNAZvEkfMQ8","timings":{"prompt_n":358,"prompt_ms":422.501,"prompt_per_token_ms":1.1801703910614525,"prompt_per_second":847.3352725792365,"predicted_n":44,"predicted_ms":791.181,"predicted_per_token_ms":17.981386363636364,"predicted_per_second":55.613064520002375}}
---
Test 3: Multiple tools available
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"calculate","arguments":"{\"expression\":25}"},"id":"6Z9h8niBG6XHUSUuzrAjjC5aOS2xIq2a"}]}}],"created":1754078263,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":27,"prompt_tokens":489,"total_tokens":516},"id":"chatcmpl-RNZIMerp9vfIgf6sri36IttP3zBZwcUT","timings":{"prompt_n":450,"prompt_ms":490.258,"prompt_per_token_ms":1.089462222222222,"prompt_per_second":917.8840528864395,"predicted_n":27,"predicted_ms":437.315,"predicted_per_token_ms":16.19685185185185,"predicted_per_second":61.74039308050262}}
---
Test 4: Complex parameters with nested objects
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"send_email","arguments":"{\"to\":\"[email protected]\",\"subject\":\"Meeting Tomorrow\",\"body\":\"Hi John, just a reminder that we have our team meetiing scheduled for tomorrow. Best regards.\",\"attachments\":[]}"},"id":"qAUv2db10NkFpzSSkf095w7MVFeiYIkJ"}]}}],"created":1754078264,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":74,"prompt_tokens":440,"total_tokens":514},"id":"chatcmpl-cYU4A50fQgpcreblZX9MOh7OJ4BJbD2i","timings":{"prompt_n":401,"prompt_ms":426.048,"prompt_per_token_ms":1.0624638403990025,"prompt_per_second":941.2085023283762,"predicted_n":74,"predicted_ms":1193.888,"predicted_per_token_ms":16.13362162162162,"predicted_per_second":61.982363504784374}}
---
Test 5: No tools needed scenario
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":"The capital ofFranceis Paris.\n","tool_calls":[{"type":"function","function":{"name":"get_weather","arguments":"{\"location\":\"Paris, france\"}"},"id":"cHYC6Sez5SgbzGPAJJ0qVHvVJC6faLbX"}]}}],"created":1754078265,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":31,"prompt_tokens":335,"total_tokens":366},"id":"chatcmpl-GQTiQ2H2esGeiwA6sGxqo3YYYwQZwkrt","timings":{"prompt_n":296,"prompt_ms":361.213,"prompt_per_token_ms":1.2203141891891893,"prompt_per_second":819.461093592977,"predicted_n":31,"predicted_ms":510.389,"predicted_per_token_ms":16.464161290322583,"predicted_per_second":60.7379861243091}}
---
Test 6: Tool with enum parameter
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"set_temperature","arguments":"{\"temperature\":72,\"unit\":\"fahrenheit\"}"},"id":"AePf5xOZZ5neOKCf9EFQwEw3SaFi7WM0"}]}}],"created":1754078267,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":72,"prompt_tokens":384,"total_tokens":456},"id":"chatcmpl-vDwwx9lTc3Bo83wAMUQAXFEp6JUwbB0L","timings":{"prompt_n":345,"prompt_ms":386.195,"prompt_per_token_ms":1.1194057971014493,"prompt_per_second":893.3310892165874,"predicted_n":72,"predicted_ms":1251.584,"predicted_per_token_ms":17.383111111111113,"predicted_per_second":57.52710165678052}}
---
Test 7: Tool call with reasoning
{"choices":[{"finish_reason":"tool_calls","index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"type":"function","function":{"name":"get_weather_forecast","arguments":"{\"location\":\"Seattle\",\"date\":2023}"},"id":"zm3TKKIFa6U1ZoL47n9Zpw6J7j7HjdnQ"}]}}],"created":1754078268,"model":"gpt-3.5-turbo","system_fingerprint":"b6058-0f5ccd6f","object":"chat.completion","usage":{"completion_tokens":42,"prompt_tokens":386,"total_tokens":428},"id":"chatcmpl-mA8Xy5VQFj9K43j8QauyWD7tvi0GGSG8","timings":{"prompt_n":347,"prompt_ms":401.396,"prompt_per_token_ms":1.1567608069164266,"prompt_per_second":864.482954488834,"predicted_n":42,"predicted_ms":642.277,"predicted_per_token_ms":15.292309523809525,"predicted_per_second":65.3923462929546}}% |
Remove this. This model has a native 256k context window.
It's also kinda weird that it only tries to write
I think there's still an issue with the parser not handling the weird format (whitespace and newlines) which seems to occur when the context window is filled up a bit (30k+). I'll fix it asap. Nonetheless, the model also appears to behave weirdly. But step by step. The only real issue with the model that I'm sure has nothing to do with the parser is the repetitive number issue.
I'm pretty sure that's just because "something" is eating the rest of the include directive... in the end everything might be fine. Currently, there are multiple possible error sources. 🤣
#14962 does make everything "work". It still prints out the tool calls and switches randomly between text + tool call, XML and JSON, but I guess that's just the model and/or quant being weird.
@bold84 Thanks for the effort! Hopefully your XML work turns out to be helpful as well eventually. 😊
Using the Hermes template, on which the model wasn't trained, makes everything work. The question is how well. See: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/qwen3coder_tool_parser.py

If you benchmark both templates, I would not be surprised to see lower performance with the Hermes template.
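For reference, a rough illustration of the two formats (the XML shape matches the Test 6 output above; the Hermes-style call is the generic JSON-in-tags form, shown here only as an assumption of what that template emits):

Qwen3-Coder native XML-style call (what this PR parses):
<tool_call>
<function=set_temperature>
<parameter=temperature>
72
</parameter>
<parameter=unit>
fahrenheit
</parameter>
</function>
</tool_call>

Hermes-style call (JSON inside <tool_call> tags):
<tool_call>
{"name": "set_temperature", "arguments": {"temperature": 72, "unit": "fahrenheit"}}
</tool_call>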
I tried merging master/#14962 with this PR but it's back to swallowing stuff after the
I have a few more changes I haven't pushed yet. But one thing is sure: the 30B model isn't strong with tool calls when there are 30k+ tokens in the context window.
Could this be the issue with the swallowing? Like it's parsing stuff and then the
It's definitely something about
I used the model with vLLM a bit more, FP8 quant and Qwen's Python tool parser, with crush and opencode. And as you can see, it's tricky to debug this. This is the main reason I decided to close this PR: there's no real value in it if Qwen3-Coder 30B A3B isn't capable enough.
I'm just hoping for a significant update or fix from Unsloth and/or Qwen. The non-coder, non-thinking 2507 30B seemed to work fine for my small snake game in C and ncurses test.
Synced the PR with master and the bundled template with upstream. @ggerganov This works flawlessly with the upstream recommended parameters and template (disregard the disclaimers in the PR). Would be nice to get this reviewed and merged in. 😊
I'll give my 2 cents as a user here... Ofc a base But Qwen Coder is a major use case right now, and llama.cpp is one of the main backends... If it's going to be a long delay to get this working correctly (which looks like it is), then why not patch in the working solution above as a stop gap, and unpatch it if and when the clean solution materializes? That way, from a user perspective, it's the most accessible and useful. Thanks
Let's see if @ochafik will have some capacity to chime in on the change. In the meantime, we should see if anyone would be interested in "adopting" the
@ggerganov whoops, seems like merging master into this triggered an auto review request
Using this branch with latest jinja and
I'm using qwen-code CLI, in case that matters at all. Here is my start command for llama-server:
GPU is RTX 5090
@Xenograph I haven't experienced any crashes at all. Maybe try without the
That might've been it, no crashes so far.
Spoke too soon; it took a while, but I eventually got another crash with the same message. This time with standard jinja.
Hi, I also experienced crashes with the latest commit. Weirdly, there's no error message at all. Is there any debug option? My command:
The error shown in qwen-code:
The output from llama-server:
And then it just crashed...
I just tried it with qwen-code and it's working fine. Maybe check the sha256sum of the GGUF? qwen-code output
Yeah, my GGUF is not the latest version (they updated too many times). It seems the crash issue is gone now; I'll check with more cases.
Hey guys, could you confirm it remains usable past ~30k tokens? That's always been the issue I personally faced with various templates when using it as an agent (mainly roocode and opencode in my case). It works like a charm before reaching this approximate limit, then degrades and becomes unusable, for example writing CoTs as comments in the files...
@blakkd I think you're better off testing this yourself through an API with OpenRouter or something. Too many variables to make a decision on context-related stuff.
Looks like my GGUF is indeed the latest. Still experiencing the aforementioned crashes, though.
@Xenograph alright. Well, I don't know. If you can find a way to reproduce it easily, then I can take a look at it, I guess.
Let me know if there is anything I can do to get you more detailed information that might help root-cause the crash when it occurs again. Happy to run a build with some extra logging or anything else that might help. I don't foresee myself being able to provide a recipe to reproduce it; it happens quite randomly AFAICT.
@marceldev89 Thanks for your reply!
In my tool-enabled AI interface I'm getting this error: After Qwen3-Coder-30B-A3B-Instruct-480B-Distill-V2-Q5_K_S.gguf makes about 11 tool calls in a row and then responds again in my interface and then starts updating a Python file. I think it occurs right when it tries writing the file, so I guess the ~12th consecutive tool call. I am using the latest main clone I just compiled minutes ago (I don't know git terminology, sorry). This "what(): Invalid diff: now finding less tool calls!" only seems to occur after making over ten consecutive tool calls. Is that related at all to the issues you are discussing here in this thread? The LLM is convinced I will have to limit the number of tool calls per turn, which means I will have to revise things in my backend if that is the case. Can anyone point me to a relevant thread on this topic?
@ionizing-plasma if you want to try this PR, you have to check it out; the main branch doesn't include the changes made here. I'm kinda new to git too, but I think you can go like this:
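A rough sketch (not the exact commands from the linked comment), assuming an existing llama.cpp clone with the upstream repository set as origin:

# Fetch the head of this PR (#15019) into a local branch and switch to it;
# the local branch name is arbitrary.
git fetch origin pull/15019/head:qwen3-coder-parser
git switch qwen3-coder-parser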
Then build. That's what I retained from there ;) #15676 (comment) |
@Xenograph I guess you could run it with

@blakkd I meant that you could test with a cloud-hosted model to see if you see the same context problems. I haven't noticed anything weird with context so far, but I try to keep it short.

@ionizing-plasma did you clone/switch to this branch and recompile it? Also, those distills are "bad" apparently. I've seen some reports a few days ago that the weights are basically the same as the original.
@marceldev89 No, I have not cloned this branch, but I greatly appreciate the advice and information you and @blakkd gave. I am stubborn and was convinced the root cause lay in my backend and how I was constructing and sending requests and tool information to llama-server. I kept at troubleshooting all evening last night and all this morning, and after discovering that I was indeed adding an extra assistant message in both the in-turn in-memory conversation history AND the database-stored conversation history, I rectified that by properly using both the content and tool_call tags instead of using a separate assistant message for each.

I'm sure the bug will resurface now that I am reporting preliminary results lol, but in testing so far today, my application is breaking its previous best records for the number of autonomous tool calls and actions WITHOUT getting the spurious "terminate called after throwing an instance of 'std::runtime_error'". Previously, it would get to about 12 or 13 autonomous calls before that occurred, but as I type this message, it has been running for over 15 minutes straight with no user interaction and has made OVER 35 successful tool calls, including shell commands, file writes and reads, and updates to the internal todo list it uses to keep track of development. Wow, honestly this is monumental and is the farthest it has ever gone without an error of SOME sort. I have been developing this since about June in different iterations and have finally begun to achieve autonomous action using a local LLM.

I am still testing with the same Qwen3-Coder-30B-A3B-Instruct-480B-Distill-V2.gguf as before. I do suspect I still have issues with its internal jinja template and will be looking into that next, but I do think a major problem in my backend has at least been solved. Sorry for the excitement; this has been months of work to get to this point, AND it still seems to work with the main branch of llama.cpp that I compiled yesterday.

Hmm, it does seem to be stuck in a bit of a loop here... ohhh wait, as I was typing this update it finally stopped lol. It does seem to repeat and reword summaries a lot near the end as it updates the todo list and tries to decide if all the tasks are complete, but it finally stopped and made a whopping 45 tool calls during its single-turn generation, which lasted for 23 minutes straight! This is a good start; I can work with this and keep improving the systems. There are still numerous other issues to work out.

Anyhow, my suggestion for others who might be encountering "terminate called after throwing an instance of 'std::runtime_error'" / "what(): Invalid diff: now finding less tool calls!" is to double-check how you are constructing your assistant messages and tool results, and make sure you are using the tags properly and not sending separate assistant messages that split conversation content and tool results.

Edited to add screenshot. Please excuse the ugly GUI and the tool call fragments showing in the chat UI itself; that will be fixed later and is a result of testing a new jinja template. Since posting this I have kept it going and it STILL has not run into the errors I was having last night. If this thing actually autonomously fixes, compiles, and runs this application I will be amazed... but it's looking like it might succeed:
@ionizing-plasma If I remember correctly the
Yeah, this message basically means the following: The entire streaming partial parsing code is extremely messy; I do recommend doing what I did in the Nemotron (among others) template as a workaround, i.e. disabling streaming of partial tool calls completely (only stream the tool call once it's completely done). It needs a solid refactoring.
@Xenograph I just ran into the
Pushed a quick fix for that specific crash. Not sure why but the model stops writing
This pull request resolves #15012 and introduces comprehensive support for the Qwen3-Coder model family's XML-based tool-calling format. It includes a new, robust XML parser and updated chat template detection logic to ensure reliable function calling.
Key Changes:

- New XML Parser (common/chat-parser.cpp)
- Chat Template Detection (common/chat.h, common/chat.cpp): the QWEN3_CODER_XML format is applied consistently, even when no tools are explicitly provided in the request.
- Comprehensive tests (tests/test-chat.cpp)

Usage Notes:
- Model Behavior: During testing, it was observed that the 30B version of the Qwen3-Coder models can sometimes generate numeric values with excessive precision (e.g., 72.0000...). This is a model-level sampling issue, not a bug in the implementation. It can be effectively mitigated with appropriate sampling parameters (see the sketch after this list).
- Model Performance: The implementation has been tested only with the 30B parameter model (different 8-bit quantizations). The 30B GGUF variant of Qwen3-Coder has shown inconsistent performance with tool calling and may not function reliably. I tested the FP8 variant of the model with vLLM and there were no issues at all.
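As a sketch of such settings, assuming the values suggested on the upstream Qwen3-Coder-30B-A3B-Instruct model card (temperature 0.7, top-p 0.8, top-k 20, repetition penalty 1.05) rather than anything stated explicitly in this PR:

# Hedged example: llama-server sampling flags assuming the upstream
# Qwen3-Coder model card values; the exact values recommended for this
# PR may differ, and the template path depends on your checkout layout.
./llama-server \
  --model Qwen3-Coder-30B-A3B-Instruct-GGUF/Qwen3-Coder-30B-A3B-Instruct-Q8_0.gguf \
  --chat-template-file models/templates/Qwen3-Coder.jinja \
  --jinja \
  --temp 0.7 \
  --top-p 0.8 \
  --top-k 20 \
  --repeat-penalty 1.05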