Awesome work! I was wondering about the inference speed of the proposed model; as I see in the paper, inference speed is tested on MT-Bench with a single 40GB A100.
Do you have an estimate of the inference speed on other machines? I am also quite curious about how the inference time scales with the length of the generated sequence.
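Since the paper only reports timings on a single 40GB A100, one way to get numbers for other machines is to time generation at several target lengths and inspect how wall-clock time grows. Below is a minimal sketch; the `dummy_generate` stand-in and its per-token cost are placeholders, not the released model. In practice you would swap in your own model's generate call (e.g. a Hugging Face `model.generate` with `max_new_tokens`).

```python
import time

def time_generation(generate, lengths):
    """Measure wall-clock time to generate each target number of tokens.

    `generate` is any callable accepting max_new_tokens; pass your own
    model's generation function here.
    """
    results = {}
    for n in lengths:
        start = time.perf_counter()
        generate(max_new_tokens=n)
        results[n] = time.perf_counter() - start
    return results

# Placeholder generator: pretends each token costs ~1 ms. Replace with a
# real model call to benchmark your own hardware.
def dummy_generate(max_new_tokens):
    time.sleep(0.001 * max_new_tokens)

timings = time_generation(dummy_generate, [64, 128, 256])
print(timings)
```

Plotting the resulting times against the token counts would show whether the cost grows roughly linearly with sequence length on a given machine, or super-linearly once attention over the growing context dominates.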