Description
System Info
- CPU: x86
- GPU: Spark
- Docker image: tensorrt-llm:1.2.0rc2

Logs are as follows:
[11/17/2025-10:59:20] [TRT-LLM] [E] Failed to initialize executor on rank 0: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
[11/17/2025-10:59:20] [TRT-LLM] [E] Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/worker.py", line 362, in worker_main
worker: GenerationExecutorWorker = worker_cls(
^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/worker.py", line 64, in __init__
self.setup_engine()
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/base_worker.py", line 177, in setup_engine
self.engine = _create_py_executor(
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/base_worker.py", line 148, in _create_py_executor
_executor = create_executor(**args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/py_executor_creator.py", line 600, in create_py_executor
estimating_kv_cache = kv_cache_creator.try_prepare_estimation()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 283, in try_prepare_estimation
self._kv_cache_config.max_tokens = self._get_token_num_for_estimation(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 263, in _get_token_num_for_estimation
self._dummy_reqs = self._create_dummy_context_requests(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 204, in _create_dummy_context_requests
requests = self._create_dummy_mm_context_request(input_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 155, in _create_dummy_mm_context_request
dummy_mm_prompt = input_processor.get_dummy_prompt(input_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/inputs/registry.py", line 334, in get_dummy_prompt
while self.image_max_dim >= self.img_min_dim:
^^^^^^^^^^^^^^^^^^
AttributeError: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
[11/17/2025-10:59:20] [TRT-LLM] [I] get signal from executor worker
[11/17/2025-10:59:20] [TRT-LLM] [E] Executor worker initialization error: Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/worker.py", line 362, in worker_main
worker: GenerationExecutorWorker = worker_cls(
^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/worker.py", line 64, in __init__
self.setup_engine()
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/base_worker.py", line 177, in setup_engine
self.engine = _create_py_executor(
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/base_worker.py", line 148, in _create_py_executor
_executor = create_executor(**args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/py_executor_creator.py", line 600, in create_py_executor
estimating_kv_cache = kv_cache_creator.try_prepare_estimation()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 283, in try_prepare_estimation
self._kv_cache_config.max_tokens = self._get_token_num_for_estimation(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 263, in _get_token_num_for_estimation
self._dummy_reqs = self._create_dummy_context_requests(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 204, in _create_dummy_context_requests
requests = self._create_dummy_mm_context_request(input_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/pyexecutor/_util.py", line 155, in _create_dummy_mm_context_request
dummy_mm_prompt = input_processor.get_dummy_prompt(input_seq_len)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/inputs/registry.py", line 334, in get_dummy_prompt
while self.image_max_dim >= self.img_min_dim:
^^^^^^^^^^^^^^^^^^
AttributeError: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
AttributeError: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/trtllm-serve", line 7, in <module>
sys.exit(main())
^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1462, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1383, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1850, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1246, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 814, in invoke
return callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/commands/serve.py", line 436, in serve
launch_server(host, port, llm_args, tool_parser, metadata_server_cfg,
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/commands/serve.py", line 163, in launch_server
llm = PyTorchLLM(**llm_args)
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 1115, in __init__
super().__init__(model, tokenizer, tokenizer_mode, skip_tokenizer_init,
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 1001, in __init__
super().__init__(model,
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 229, in __init__
self._build_model()
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 1062, in _build_model
self._executor = self._executor_cls.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/executor.py", line 549, in create
return GenerationExecutor._create_ipc_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/executor.py", line 420, in _create_ipc_executor
return GenerationExecutorProxy(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/proxy.py", line 103, in __init__
self._start_executor_workers(worker_kwargs)
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/executor/proxy.py", line 344, in _start_executor_workers
raise RuntimeError(
RuntimeError: Executor worker returned error
[W1117 10:59:23.537940364 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
[W1117 10:59:23.537927868 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Run the Docker container, then execute: trtllm-serve serve Qwen2.5-VL-7B-Instruct-FP8
Expected behavior
The model should be served successfully.
actual behavior
Serving fails during executor initialization with:
AttributeError: 'Qwen2VLInputProcessorBase' object has no attribute 'image_max_dim'
additional notes
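Judging from the traceback, `get_dummy_prompt` in `tensorrt_llm/inputs/registry.py` (line 334) loops on `self.image_max_dim` and `self.img_min_dim`, but `Qwen2VLInputProcessorBase` apparently never defines `image_max_dim`, so the first attribute access raises. The failure pattern can be reproduced in isolation; everything below is an illustrative sketch (class and attribute values are hypothetical, only the attribute names come from the traceback):

```python
class InputProcessorBase:
    # Mirrors the shape of registry.py:334 — the base method assumes
    # both image_max_dim and img_min_dim exist on the instance.
    def get_dummy_prompt(self):
        while self.image_max_dim >= self.img_min_dim:  # AttributeError here
            self.image_max_dim //= 2
        return "dummy prompt"


class Qwen2VLProcessor(InputProcessorBase):
    # Illustrative stand-in: defines a minimum dimension but never
    # sets image_max_dim, so the inherited loop crashes on first access.
    img_min_dim = 56


try:
    Qwen2VLProcessor().get_dummy_prompt()
except AttributeError as e:
    print(e)  # → 'Qwen2VLProcessor' object has no attribute 'image_max_dim'
```

If this reading is right, the fix is to either define `image_max_dim` on the Qwen2-VL input processor or make `get_dummy_prompt` use whatever attribute names that processor actually exposes.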
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.