-
I've also noticed this - in testing on Friday the Sonar model was returning responses in an average of 6.6s, but today the same requests are taking 10-15 seconds across the board.
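For anyone wanting to reproduce this kind of comparison, here's a minimal timing sketch against Perplexity's OpenAI-compatible chat completions endpoint. It assumes a `PPLX_API_KEY` environment variable; the prompt and sample count are arbitrary placeholders:

```python
# Minimal sketch: time a single "sonar" request end to end.
# Assumes PPLX_API_KEY is set; uses Perplexity's OpenAI-compatible
# chat completions endpoint.
import os
import time

import requests

API_URL = "https://api.perplexity.ai/chat/completions"


def time_sonar_request(prompt: str) -> float:
    """Send one chat completion request and return elapsed wall-clock seconds."""
    payload = {
        "model": "sonar",
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"Authorization": f"Bearer {os.environ['PPLX_API_KEY']}"}
    start = time.perf_counter()
    response = requests.post(API_URL, json=payload, headers=headers, timeout=60)
    response.raise_for_status()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Average a few runs to smooth over per-request variance.
    samples = [time_sonar_request("What is the capital of France?") for _ in range(5)]
    print(f"mean latency: {sum(samples) / len(samples):.2f}s")
```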
-
I've just sent Perplexity an email. 20 seconds to complete via the API versus 2.54 seconds in the playground. I chose Sonar for lightning-quick responses to fairly simple requests.
-
+1 for this feature request - there's a real need for a 7/8B-parameter model for low-latency responses, or for inference on the existing sonar model to get significantly faster.
-
Bumping this - the docs seem a little sparse about which parameters we can tweak to drive latency down. Would love to know how to make sure sonar responses stay under 5s.
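Until the docs say more, the only levers I'm aware of are the generic OpenAI-compatible ones, shown in the sketch below: cap `max_tokens` (decode time scales with output length) and use `stream` to cut perceived latency. Whether sonar exposes additional latency-specific tuning is exactly what's unclear; this example only uses parameters known from the OpenAI-compatible API:

```python
# Sketch of generic latency levers on an OpenAI-compatible API.
# max_tokens limits how much the model generates (less output =
# less decode time); streaming is shown separately below.
import os

import requests

API_URL = "https://api.perplexity.ai/chat/completions"

payload = {
    "model": "sonar",
    "messages": [
        {"role": "user", "content": "Answer in one sentence: what is HTTP?"}
    ],
    "max_tokens": 100,  # cap output length; long answers dominate total latency
}
headers = {"Authorization": f"Bearer {os.environ['PPLX_API_KEY']}"}
response = requests.post(API_URL, json=payload, headers=headers, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```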
-
Currently, …
-
Comparing "llama-3.1-sonar-small-128k-online" and "sonar", "sonar" is slower at streaming responses. I would like the streaming speed to be improved.
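A rough way to quantify this: measure time-to-first-chunk and total stream duration for both models. Here's a sketch using the `openai` client pointed at Perplexity's OpenAI-compatible base URL (assumes a `PPLX_API_KEY` env var; the prompt is a placeholder):

```python
# Sketch: compare streaming behavior between two models by measuring
# time-to-first-chunk and total stream duration.
import os
import time

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["PPLX_API_KEY"],
    base_url="https://api.perplexity.ai",
)


def measure_stream(model: str, prompt: str) -> None:
    start = time.perf_counter()
    first_chunk = None
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for _chunk in stream:
        if first_chunk is None:
            first_chunk = time.perf_counter() - start
    total = time.perf_counter() - start
    print(f"{model}: first chunk {first_chunk:.2f}s, full response {total:.2f}s")


for model in ("llama-3.1-sonar-small-128k-online", "sonar"):
    measure_stream(model, "Explain HTTP streaming in two sentences.")
```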
-
My requests used to take ~5 seconds with the …
-
The response speed of "sonar", the new model from Perplexity, is much slower than that of "llama-3.1-sonar-small-128k-online".
Will it improve?