Skip to content

Support Tool Calling with transformers 5.x for GLM-4.6V Models #240

@tamthaihoangminh

Description

@tamthaihoangminh

Feature request / 功能建议

Sorry but please allow me to use some AI to summary my findings and request. Im not sure who should I send this feature request to vllm team, transformer team, zai team? :D May be with your connection you can onboard more developer using your surprisingly good model

Request: Add support for tool calling (glm45 parser) when using transformers 5.x with GLM-4.6V models.

Current Limitation: GLM-4.6V models require transformers 5.0.0rc1 for vision/multimodal support, but vLLM v0.13's tool calling parser (glm45) is incompatible with transformers 5.x, forcing users to choose between tool calling OR vision.

Environment

  • vLLM Version: v0.13.0
  • Model: zai-org/GLM-4.6V-FP8, zai-org/GLM-4.6V-Flash
  • Transformers 4.56.0: Tool calling works, vision not supported
  • Transformers 5.0.0rc1: Vision works, tool calling broken
  • Tool Call Parser: glm45
  • GPU: 2x NVIDIA RTX 6000 Pro (96GB each)

Current Behavior

With transformers 4.56.0 (Tool Calling Works)

vllm serve zai-org/GLM-4.6V-FP8 \
  --tool-call-parser glm45 \
  --limit-mm-per-prompt '{"image": 0, "video": 0}'

Response includes tool_calls:

{
  "choices": [{
    "message": {
      "tool_calls": [{
        "id": "call_abc",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\": \"Tokyo\"}"
        }
      }]
    }
  }]
}

With transformers 5.0.0rc1 (Tool Calling Broken)

pip install --upgrade 'transformers==5.0.0rc1'
vllm serve zai-org/GLM-4.6V-FP8 \
  --tool-call-parser glm45 \
  --limit-mm-per-prompt '{"image": 1, "video": 0}'

Response has tool_calls: null:

{
  "choices": [{
    "message": {
      "content": "I should use the get_weather function...",
      "tool_calls": null
    }
  }]
}

Why transformers 5.0.0rc1 is Required

GLM-4.6V models have a multimodal vision processor that only exists in transformers 5.0.0rc1:

Error: Cannot find module 'transformers.models.glm.visualizing_glm'

The GLMVisionProcessor class was added in transformers 5.0.0rc1 and is required for image support.

Motivation / 动机

Just let the team knew something broken but not a bug from zai :D rather an ecosystem

Your contribution / 您的贡献

No

Metadata

Metadata

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions