Hello,

I believe `AsyncLLMEngine` was removed some time ago. I was wondering whether there is support for loading a built engine and serving asynchronous inference requests. I have seen https://github.com/NVIDIA/TensorRT-LLM/blob/main/tests/hlapi/test_llm.py#L288, but that test does not load a built engine, does it?
Replies: 1 comment

- I have verified that `model` can be an `LLMEngine`.
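
For reference, here is a minimal sketch of what async inference could look like with the high-level `LLM` API, assuming a TensorRT-LLM version where `model` accepts a built engine directory; the path and `max_tokens` value are placeholders, and exact signatures vary between releases, so check the docs for yours:

```python
# Minimal sketch, not an official example: load a (possibly pre-built)
# engine via the high-level LLM API and stream tokens asynchronously.
import asyncio

from tensorrt_llm import LLM, SamplingParams

# Hypothetical path; in recent versions `model` may point at a directory
# containing an already-built TensorRT-LLM engine.
llm = LLM(model="/path/to/built_engine_dir")
params = SamplingParams(max_tokens=64)

async def main() -> None:
    # With streaming=True, generate_async yields incremental outputs
    # as tokens are produced.
    async for output in llm.generate_async("Hello, my name is", params,
                                           streaming=True):
        print(output.outputs[0].text)

asyncio.run(main())
```

A non-streaming variant is `output = await llm.generate_async(prompt, params)`, which resolves once the full completion is ready.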