Open
Labels
feature request: New feature or request. This includes new model, dtype, functionality support
Description
🚀 The feature, motivation and pitch
When TRT-LLM runs in production, certain combinations of requests occasionally cause it to crash or hang. These failures are currently very hard to reproduce, because there is no way to find out which requests were in the batch when the crash happened.
I propose adding a config parameter that enables dumping the current requests to a file. The dump would be triggered when a crash occurs, and when an external system kills the worker because it is failing health checks.
cc @ltalal
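To make the request concrete, here is a rough Python sketch of the behaviour I have in mind. It is not an existing TRT-LLM API: the class `InflightRequestDumper`, its methods, and the `dump_path` parameter are all hypothetical names made up for illustration. It assumes the worker is a Python process, that the health checker stops it with SIGTERM, and that request payloads are JSON-serializable.

```python
import json
import os
import signal
import sys
import tempfile
import threading
import time


class InflightRequestDumper:
    """Hypothetical helper (not part of TRT-LLM): tracks in-flight requests
    and writes them to a file when the process is about to die."""

    def __init__(self, dump_path):
        self.dump_path = dump_path
        # RLock so a signal handler running on the main thread can re-enter
        # while the main thread already holds the lock.
        self._lock = threading.RLock()
        self._inflight = {}

    def add(self, request_id, payload):
        # Call when a request is scheduled into a batch.
        with self._lock:
            self._inflight[request_id] = payload

    def remove(self, request_id):
        # Call when a request finishes or is cancelled.
        with self._lock:
            self._inflight.pop(request_id, None)

    def dump(self, reason):
        # Snapshot under the lock, then write outside of it.
        with self._lock:
            snapshot = dict(self._inflight)
        record = {
            "reason": reason,
            "time": time.time(),
            "pid": os.getpid(),
            "requests": snapshot,
        }
        # Write to a temp file and rename, so a reader never sees a partial dump.
        directory = os.path.dirname(os.path.abspath(self.dump_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            json.dump(record, f)
        os.replace(tmp_path, self.dump_path)
        return record

    def install(self):
        # Dump on unhandled Python exceptions, then defer to the previous hook.
        previous_hook = sys.excepthook

        def on_exception(exc_type, exc, tb):
            self.dump("unhandled " + exc_type.__name__)
            previous_hook(exc_type, exc, tb)

        sys.excepthook = on_exception

        # Dump on SIGTERM (assumed to be what the health checker sends),
        # then restore the default action and re-raise the signal.
        def on_sigterm(signum, frame):
            self.dump("SIGTERM")
            signal.signal(signal.SIGTERM, signal.SIG_DFL)
            os.kill(os.getpid(), signal.SIGTERM)

        signal.signal(signal.SIGTERM, on_sigterm)
```

A pure-Python hook like this cannot cover SIGKILL, segfaults in native code, or a main thread stuck inside a native call, since Python signal handlers only run between bytecodes. That is the main reason I think this needs support inside the runtime itself rather than in user code.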
Alternatives
Setting up a proxy that is fully aware of the state of every request being executed on every instance.
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.