Thanks for your amazing work! However, the inference latency is too long.
When the steps=128, the inference latency is approximately 10s on single H100.
Reducing the steps can shorten the inference time, but it also degrades the model’s output quality.
Are there any acceleration methods that can improve inference speed without sacrificing model accuracy?
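For context, here is the back-of-envelope arithmetic behind these numbers (a rough sketch that assumes latency scales linearly with the number of steps, which may not hold exactly in practice):

```python
# Reported figures: ~10 s total at steps=128 on a single H100.
total_latency_s = 10.0
steps = 128

# Implied per-step cost.
per_step_ms = total_latency_s / steps * 1000  # ~78 ms per step

# Projected latency at reduced step counts (linear-scaling assumption).
for s in (64, 32, 16):
    print(f"steps={s}: ~{per_step_ms * s / 1000:.1f} s")
```

So even halving the step count only gets to ~5 s, which is why a method that preserves quality at low step counts (rather than just reducing steps) would be valuable.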