Description
System Info
- CPU: x86
- GPU: Spark
- Docker image: tensorrt-llm:1.2.0rc2

Logs are as follows:
[11/17/2025-10:59:20] [TRT-LLM] [E] Failed to initialize executor on rank 0: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
[11/17/2025-10:59:20] [TRT-LLM] [E] Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/worker.py", line 362, in worker_main
worker: GenerationExecutorWorker = worker_cls(
^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/worker.py", line 64, in __init__
self.setup_engine()
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/base_worker.py", line 177, in setup_engine
self.engine = _create_py_executor(
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/base_worker.py", line 148, in _create_py_executor
_executor = create_executor(**args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/py_executor_creator.py", line 600, in create_py_executor
estimating_kv_cache = kv_cache_creator.try_prepare_estimation()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 283, in try_prepare_estimation
self._kv_cache_config.max_tokens = self._get_token_num_for_estimation(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 263, in _get_token_num_for_estimation
self._dummy_reqs = self._create_dummy_context_requests(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 204, in _create_dummy_context_requests
requests = self._create_dummy_mm_context_request(input_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 155, in _create_dummy_mm_context_request
dummy_mm_prompt = input_processor.get_dummy_prompt(input_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/inputs/registry.py", line 334, in get_dummy_prompt
while self.image_max_dim >= self.img_min_dim:
^^^^^^^^^^^^^^^^^^
AttributeError: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
[11/17/2025-10:59:20] [TRT-LLM] [I] get signal from executor worker
[11/17/2025-10:59:20] [TRT-LLM] [E] Executor worker initialization error: Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/worker.py", line 362, in worker_main
worker: GenerationExecutorWorker = worker_cls(
^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/worker.py", line 64, in __init__
self.setup_engine()
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/base_worker.py", line 177, in setup_engine
self.engine = _create_py_executor(
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/base_worker.py", line 148, in _create_py_executor
_executor = create_executor(**args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/py_executor_creator.py", line 600, in create_py_executor
estimating_kv_cache = kv_cache_creator.try_prepare_estimation()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 283, in try_prepare_estimation
self._kv_cache_config.max_tokens = self._get_token_num_for_estimation(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 263, in _get_token_num_for_estimation
self._dummy_reqs = self._create_dummy_context_requests(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 204, in _create_dummy_context_requests
requests = self._create_dummy_mm_context_request(input_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 155, in _create_dummy_mm_context_request
dummy_mm_prompt = input_processor.get_dummy_prompt(input_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/inputs/registry.py", line 334, in get_dummy_prompt
while self.image_max_dim >= self.img_min_dim:
^^^^^^^^^^^^^^^^^^
AttributeError: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
AttributeError: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/trtllm-serve", line 7, in <module>
sys.exit(main())
^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1462, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1383, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1850, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1246, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 814, in invoke
return callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/commands/serve.py", line 436, in serve
launch_server(host, port, llm_args, tool_parser, metadata_server_cfg,
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/commands/serve.py", line 163, in launch_server
llm = PyTorchLLM(**llm_args)
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 1115, in __init__
super().__init__(model, tokenizer, tokenizer_mode, skip_tokenizer_init,
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 1001, in __init__
super().__init__(model,
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 229, in __init__
self._build_model()
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 1062, in _build_model
self._executor = self._executor_cls.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/executor.py", line 549, in create
return GenerationExecutor._create_ipc_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/executor.py", line 420, in _create_ipc_executor
return GenerationExecutorProxy(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/proxy.py", line 103, in __init__
self._start_executor_workers(worker_kwargs)
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/proxy.py", line 344, in _start_executor_workers
raise RuntimeError(
RuntimeError: Executor worker returned error
[W1117 10:59:23.537940364 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
[W1117 10:59:23.537927868 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Run the Docker container, then execute: trtllm-serve serve Qwen2.5-VL-7B-Instruct-FP8
Expected behavior
The model should be served successfully.
actual behavior
Serving fails during executor initialization with:
AttributeError: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
additional notes
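Judging from the traceback, `get_dummy_prompt` in `tensorrt_llm/inputs/registry.py` (line 334) loops on `self.image_max_dim` and `self.img_min_dim`, but `Qwen2VLInputProcessorBase` apparently never defines `image_max_dim`, so the first attribute access raises. The failure pattern can be reproduced in isolation; everything below is an illustrative sketch (class and attribute values are hypothetical, only the attribute names come from the traceback):

```python
class InputProcessorBase:
    # Mirrors the shape of registry.py:334 — the base method assumes
    # both image_max_dim and img_min_dim exist on the instance.
    def get_dummy_prompt(self):
        while self.image_max_dim >= self.img_min_dim:  # AttributeError here
            self.image_max_dim //= 2
        return "dummy prompt"


class Qwen2VLProcessor(InputProcessorBase):
    # Illustrative stand-in: defines a minimum dimension but never
    # sets image_max_dim, so the inherited loop crashes on first access.
    img_min_dim = 56


try:
    Qwen2VLProcessor().get_dummy_prompt()
except AttributeError as e:
    print(e)  # → 'Qwen2VLProcessor' object has no attribute 'image_max_dim'
```

If this reading is right, the fix is to either define `image_max_dim` on the Qwen2-VL input processor or make `get_dummy_prompt` use whatever attribute names that processor actually exposes.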
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.