[Question]: Nvidia Triton/TensorRT support #8447
Replies: 2 comments
-
🤖 Hi there! Currently, I'm unable to find any specific information regarding Nvidia Triton/TensorRT support in the LlamaIndex repository. To better assist you, could you please provide more details about your use case? The more information you can provide, the better I can assist you. Thanks for your patience and cooperation.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
-
To increase inference speed, I want to use a Triton server. Even if the model is hosted on Triton, how do I integrate it as an LLM in LlamaIndex? A sketch of one approach follows.
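There was no official integration at the time of this discussion, but one workable pattern is to subclass LlamaIndex's `CustomLLM` and call Triton from its `complete` method via `tritonclient`. A minimal sketch, assuming the pre-v0.10 import layout (newer versions expose the same classes under `llama_index.core.llms`) and a TensorRT-LLM-style model that takes BYTES tensors named `text_input`/`text_output` and is served as `ensemble`; tensor names, shapes, and the model name all depend on your Triton model config, so check yours:

```python
import numpy as np
import tritonclient.http as httpclient

from llama_index.llms import (
    CustomLLM,
    CompletionResponse,
    CompletionResponseGen,
    LLMMetadata,
)
from llama_index.llms.base import llm_completion_callback


class TritonLLM(CustomLLM):
    server_url: str = "localhost:8000"  # Triton HTTP endpoint
    model_name: str = "ensemble"        # assumption: TensorRT-LLM ensemble model

    @property
    def metadata(self) -> LLMMetadata:
        # Advertise limits so LlamaIndex can size prompts correctly.
        return LLMMetadata(
            context_window=4096,
            num_output=256,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs) -> CompletionResponse:
        client = httpclient.InferenceServerClient(url=self.server_url)
        # Pack the prompt into the BYTES input tensor the model expects.
        inp = httpclient.InferInput("text_input", [1], "BYTES")
        inp.set_data_from_numpy(np.array([prompt.encode("utf-8")], dtype=object))
        out = httpclient.InferRequestedOutput("text_output")
        result = client.infer(self.model_name, inputs=[inp], outputs=[out])
        text = result.as_numpy("text_output")[0].decode("utf-8")
        return CompletionResponse(text=text)

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs) -> CompletionResponseGen:
        # Simplest possible stream: yield the full completion once.
        yield self.complete(prompt, **kwargs)
```

With that in place, `TritonLLM(server_url="localhost:8000")` can be passed anywhere LlamaIndex accepts an LLM, e.g. `ServiceContext.from_defaults(llm=TritonLLM())` for a query engine.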
-
Question Validation
Question
Is there any support for wrapping Triton server/TensorRT-hosted models in LlamaIndex? I'm not able to find enough documentation or support for this. Thank you.
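-
LlamaIndex later shipped a dedicated Triton LLM integration. A minimal usage sketch, assuming the `llama-index-llms-nvidia-triton` package with its `NvidiaTriton` class and `server_url`/`model_name` parameters, and a TensorRT-LLM `ensemble` model behind Triton's gRPC endpoint; verify the names against the current docs:

```python
# pip install llama-index-llms-nvidia-triton tritonclient[grpc]
# Assumptions: Triton's gRPC endpoint on the default port 8001, serving a
# TensorRT-LLM "ensemble" model; parameter names per the integration docs.
from llama_index.llms.nvidia_triton import NvidiaTriton

llm = NvidiaTriton(server_url="localhost:8001", model_name="ensemble")
print(llm.complete("What is NVIDIA Triton Inference Server?").text)
```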