How to input multimodal embeddings to an LLM #5807

RuABraun · 2025-07-07T19:57:33Z


I'm trying to pass multimodal embeddings to an LLM using this function: https://github.com/NVIDIA/TensorRT-LLM/blob/release/0.20/tensorrt_llm/llmapi/llm.py#L341

I'm assuming the multimodal embeddings get fused with the text embeddings the way it is done here: https://github.com/NVIDIA/TensorRT-LLM/blob/release/0.20/tensorrt_llm/_torch/models/modeling_multimodal_utils.py#L31

Is that assumption correct? Judging by the nonsense my model outputs, I assume not. I haven't been able to find relevant docs or work out where the C++ implementation is.
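
To make the assumption concrete, here is a minimal, framework-free sketch of the fusing behaviour I have in mind: every placeholder position in `input_ids` is overwritten, in order, by one row of the multimodal embeddings, and all other positions use the normal text embedding lookup. This is not TensorRT-LLM code. The names `fuse_embeddings`, `text_table`, and `placeholder_id` are hypothetical, and whether the linked function actually behaves this way is exactly what I'm unsure about.

```python
def fuse_embeddings(input_ids, text_table, mm_embeds, placeholder_id):
    """Return one embedding row per token.

    Hypothetical illustration only: placeholder tokens take the next
    multimodal row (in order); every other token is looked up in text_table.
    """
    n_placeholders = sum(1 for tok in input_ids if tok == placeholder_id)
    if n_placeholders != len(mm_embeds):
        raise ValueError(
            f"{n_placeholders} placeholder tokens but {len(mm_embeds)} multimodal rows"
        )

    mm_rows = iter(mm_embeds)
    fused = []
    for tok in input_ids:
        if tok == placeholder_id:
            fused.append(list(next(mm_rows)))   # multimodal embedding
        else:
            fused.append(list(text_table[tok]))  # ordinary text embedding
    return fused


# Toy data: vocabulary of 3 tokens, hidden size 2, placeholder id 3 (assumed).
text_table = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
placeholder_id = 3
input_ids = [1, 3, 3, 2]
mm_embeds = [[9.0, 9.5], [8.0, 8.5]]

fused = fuse_embeddings(input_ids, text_table, mm_embeds, placeholder_id)
```

If the real fusing step expects a different placeholder convention, or a different number of rows per image, a count mismatch like the one checked above would be one way to end up with garbage output.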


Replies: 0 comments
