Intermittent 11s Latency Spikes on Nova-3 (Timeout/Retry Signature) #1512
Replies: 5 comments 16 replies
-
Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.
-
Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com - being verified through this process will allow our team to help you in a much more streamlined fashion.
-
The real-time streaming API also has the same slowdown.
-
Same problem here. Hope there's a solution soon. Using Nova-3 Medical.
-
Hey team! I'm seeing a significant number of requests to Nova-3 hitting a massive latency wall over the last couple of weeks.
The Issue: Requests intermittently hang for roughly 10.5 to 11.5 seconds.
Typical latency: ~0.8s - 1.2s.
Slow requests: Consistently ~11 seconds.
This suggests an internal 10-second backend timeout followed by a successful retry: 10 seconds of waiting plus a normal ~1 second request adds up to the ~11 seconds observed.
Request IDs (captured Dec 30):
e11fab7c-785f-4df1-9727-dd5aaa4cdb98 (11.58s)
2c902511-8a7b-4dd9-a1e1-695a0da73e9e (11.18s)
3dffd5c9-d3e9-42ed-be26-e7a8323a4be6 (11.16s)
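One rough way to sanity-check the timeout-plus-retry reading is to subtract the hypothesized 10-second timeout from each captured latency and look at what is left. This is only a sketch: the 10-second figure is the post's hypothesis, not a confirmed Deepgram setting, and `ASSUMED_TIMEOUT_S` is an illustrative name.

```python
# Hypothesized internal timeout from the post; NOT a confirmed Deepgram value.
ASSUMED_TIMEOUT_S = 10.0

# Request IDs and total latencies (seconds) as reported above.
slow_requests = {
    "e11fab7c-785f-4df1-9727-dd5aaa4cdb98": 11.58,
    "2c902511-8a7b-4dd9-a1e1-695a0da73e9e": 11.18,
    "3dffd5c9-d3e9-42ed-be26-e7a8323a4be6": 11.16,
}

# Time left once the assumed timeout is subtracted, rounded to 10 ms.
residuals = {
    request_id: round(latency - ASSUMED_TIMEOUT_S, 2)
    for request_id, latency in slow_requests.items()
}

for request_id, residual in residuals.items():
    print(f"{request_id}: {residual:.2f}s after the assumed timeout")
```

If the residuals land near the typical 0.8s - 1.2s latency, that is consistent with one timed-out attempt followed by a normal one, though it does not prove it.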
What I've Observed/Ruled Out:
Intermittent but Reproducible: In stress tests, we hit this "11s wall" roughly 5-10% of the time.
Not Parameter Specific: Reproduced on "bare bones" requests (just model + language) and full feature requests (smart_format, utterances, paragraphs).
Not Format Specific: Reproduced on both WAV and FLAC files.
Not Bandwidth: Test files are very small (~300KB).
Model Agnostic: Reproduced on both nova-3 and nova-2.
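For anyone trying to reproduce the "5-10% of the time" figure, the slow-request rate can be computed from a list of recorded latencies along these lines. The helper name `slow_rate`, the 10-second cutoff, and the sample values are all illustrative assumptions, not data from the stress tests described above.

```python
def slow_rate(latencies_s, threshold_s=10.0):
    """Return the fraction of requests that took at least threshold_s seconds."""
    if not latencies_s:
        raise ValueError("no latencies recorded")
    slow = [t for t in latencies_s if t >= threshold_s]
    return len(slow) / len(latencies_s)


# Made-up sample for illustration only: 18 normal requests and 2 slow ones.
sample_latencies = [0.9, 1.1, 11.2, 0.8, 1.0, 1.2, 0.95, 1.05, 0.85, 1.15,
                    11.5, 0.9, 1.0, 1.1, 0.8, 1.2, 1.0, 0.9, 1.1, 1.0]

rate = slow_rate(sample_latencies)
print(f"{rate:.0%} of requests hit the wall")
```

A fixed cutoff works here because the two latency groups are so far apart (about 1 second versus about 11 seconds); a percentile-based summary would be a reasonable alternative if they overlapped.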
Usage Context (Python): I am using the Python requests library to POST to the /v1/listen endpoint.
    # Simplified usage:
    params = {
        "model": "nova-3",
        "smart_format": "true",
        "language": "en",
        "paragraphs": "true",
    }
    # Sending small buffers (<400KB) via POST with appropriate headers
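A fuller version of the request described above, with client-side timing, might look like the sketch below. The helper names `timed_call` and `post_audio`, the `audio/wav` content type, and the 30-second client timeout are assumptions for illustration; the endpoint URL and `Token` authorization scheme follow Deepgram's public API documentation, but check them against your own setup.

```python
import time

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


def timed_call(send):
    """Run send() and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = send()
    return result, time.perf_counter() - start


def post_audio(audio_bytes, api_key, params, content_type="audio/wav"):
    """POST raw audio bytes to /v1/listen and return the parsed JSON body."""
    import requests  # third-party; imported here so timed_call works without it

    response = requests.post(
        DEEPGRAM_URL,
        params=params,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": content_type,
        },
        data=audio_bytes,
        timeout=30,  # client-side limit, set well above the ~11s stall
    )
    response.raise_for_status()
    return response.json()
```

Usage would be something like `body, elapsed = timed_call(lambda: post_audio(audio, key, params))`, logging `elapsed` next to the request ID from the response (assumed here to be under `body["metadata"]["request_id"]`) so slow calls can be reported by ID.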
This is causing a poor UX for users who expect "Nova" speed.
Can you check the logs for these Request IDs to see what is causing the internal stall?
Thanks!