Intermittent 11s Latency Spikes on Nova-3 (Timeout/Retry Signature) #1512

ryanshrott · 2025-12-30T18:39:28Z

ryanshrott
Dec 30, 2025

Hey team! I'm seeing a significant number of requests to Nova-3 hitting a massive latency wall over the last couple of weeks.

The Issue: Requests are intermittently hanging for roughly 10.5 to 11.5 seconds.

Typical latency: ~0.8s - 1.2s.

Slow requests: Consistently ~11 seconds.

This suggests an internal 10-second backend timeout followed by a successful retry.

Request IDs (captured Dec 30):

e11fab7c-785f-4df1-9727-dd5aaa4cdb98 (11.58s)

2c902511-8a7b-4dd9-a1e1-695a0da73e9e (11.18s)

3dffd5c9-d3e9-42ed-be26-e7a8323a4be6 (11.16s)
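
The timeout-plus-retry reading can be sanity-checked with simple arithmetic: subtract a typical fast latency from each slow one and see whether the remainder clusters around 10 seconds. A minimal sketch using the three latencies above; the 1.0 s baseline (midpoint of the reported 0.8-1.2 s range), the 0.7 s tolerance, and the function name are assumptions for illustration, not anything confirmed by Deepgram:

```python
# Latencies (seconds) of the three slow requests listed above.
SLOW_LATENCIES = [11.58, 11.18, 11.16]

# Assumptions for illustration: the baseline is the midpoint of the reported
# 0.8-1.2 s typical range, and the suspected internal timeout is 10 s.
BASELINE_S = 1.0
SUSPECTED_TIMEOUT_S = 10.0
TOLERANCE_S = 0.7


def matches_timeout_signature(latency_s, baseline_s=BASELINE_S,
                              timeout_s=SUSPECTED_TIMEOUT_S,
                              tolerance_s=TOLERANCE_S):
    """Return True if the latency looks like one timeout plus one normal request."""
    excess = latency_s - baseline_s
    return abs(excess - timeout_s) <= tolerance_s


for latency in SLOW_LATENCIES:
    excess = latency - BASELINE_S
    print(f"{latency:.2f}s -> excess {excess:.2f}s, "
          f"signature match: {matches_timeout_signature(latency)}")
```

A match is only consistent with the hypothesis; it does not prove that a retry happened server-side.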

What I've Observed/Ruled Out:

Intermittent but Reproducible: In stress tests, we hit this "11s wall" roughly 5-10% of the time.

Not Parameter Specific: Reproduced on "bare bones" requests (just model + language) and full feature requests (smart_format, utterances, paragraphs).

Not Format Specific: Reproduced on both WAV and FLAC files.

Not Bandwidth: Test files are very small (~300KB).

Model Agnostic: Reproduced on both nova-3 and nova-2.
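
For anyone wanting to reproduce the 5-10% figure, the measurement can be sketched roughly as follows. This is a sequential harness under stated assumptions, not the stress test actually used here: `send_request` is a hypothetical zero-argument callable you supply, and the 5-second threshold is an assumed cutoff chosen to sit between the ~1 s and ~11 s clusters.

```python
import time


def measure_spike_rate(send_request, n_requests, spike_threshold_s=5.0,
                       clock=time.perf_counter):
    """Call send_request() n_requests times and report how many were slow.

    send_request: zero-argument callable performing one transcription
        request (hypothetical; supply your own).
    spike_threshold_s: latencies above this count as spikes (assumed value).
    clock: injectable timer, which makes the harness easy to test offline.
    """
    latencies = []
    for _ in range(n_requests):
        start = clock()
        send_request()
        latencies.append(clock() - start)
    spikes = [t for t in latencies if t > spike_threshold_s]
    return {
        "count": len(latencies),
        "spikes": len(spikes),
        "spike_rate": len(spikes) / len(latencies) if latencies else 0.0,
        "max_s": max(latencies, default=0.0),
    }
```

Running it with a few hundred requests per configuration (bare-bones vs. full-feature parameters, WAV vs. FLAC) gives a spike rate per configuration that can be compared directly.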

Usage Context (Python): I am using the Python requests library to POST to the /v1/listen endpoint.

Python

# Simplified usage:
params = {
    "model": "nova-3",
    "smart_format": "true",
    "language": "en",
    "paragraphs": "true",
}

# Sending small buffers (<400KB) via POST with appropriate headers
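
To make the request and the timing concrete, here is a hedged sketch of how a single call could be timed and paired with its request ID. It is not the exact production code: the function name, the `Token` authorization scheme, the `audio/wav` content type, the 30-second client timeout, and reading the ID from `metadata.request_id` in the JSON response are assumptions that should be checked against the Deepgram API reference. The `session` argument is expected to be a `requests.Session` (or anything with a compatible `.post()`).

```python
import time

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


def timed_transcribe(session, api_key, audio_bytes, params,
                     content_type="audio/wav", timeout_s=30.0):
    """POST audio to /v1/listen and return (latency_s, request_id, status).

    session: a requests.Session, or any object with a compatible .post().
    """
    headers = {
        "Authorization": f"Token {api_key}",  # assumed auth scheme
        "Content-Type": content_type,
    }
    start = time.perf_counter()
    resp = session.post(DEEPGRAM_URL, params=params, headers=headers,
                        data=audio_bytes, timeout=timeout_s)
    latency_s = time.perf_counter() - start

    # Assumption: the request ID is reported under metadata.request_id.
    request_id = None
    if resp.status_code == 200:
        request_id = resp.json().get("metadata", {}).get("request_id")
    return latency_s, request_id, resp.status_code
```

With the requests library installed, a call might look like `timed_transcribe(requests.Session(), API_KEY, audio_bytes, params)`; logging the returned latency next to the request ID gives pairs like the ones listed above.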

This is causing a poor UX for users who expect "Nova" speed.

Can you check the logs for these Request IDs to see what is causing the internal stall?

Thanks!

Taltzipi · 2025-12-30T18:39:31Z

deepgram-community[bot]
bot Dec 30, 2025

Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.
Consider joining our Discord community for more opportunity to engage with your fellow Deepgram users. You can earn points which can be redeemed for cool stuff by being active in our communities!

1 reply

Taltzipi Jan 2, 2026

Same here.

https://github.com/orgs/deepgram/discussions/1504

ryanshrott · 2025-12-30T18:39:47Z

deepgram-community[bot]
bot Dec 30, 2025

Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com - being verified through this process will allow our team to help you in a much more streamlined fashion.

1 reply

ryanshrott Dec 30, 2025
Author

done

ryanshrott · 2025-12-30T19:28:34Z

deepgram-community[bot]
bot Dec 30, 2025

While our batch (pre-recorded) offerings are typically very fast, this can sometimes vary based on overall system utilization, and we do not guarantee a processing speed SLA as of today on hosted pre-recorded STT. For use cases that require realtime guarantees, we recommend streaming with Nova-3 or Flux.

That said, this is something we're looking into offering as a premium batch product, so I'd love to hear more about your use case for this sort of 'fast batch'!

This message was sent by nick kaimakis from Deepgram, via our community automation.

4 replies

ryanshrott Dec 30, 2025
Author

OK, but I had been seeing response times of 1-2 seconds for the prior 6 months. The 11-second lag is new; I first saw it about 2 weeks ago.

ryanshrott Dec 30, 2025
Author

My use case does not support the streaming API.

ryanshrott Dec 30, 2025
Author

I switched to AssemblyAI for now until I can figure this out. I prefer the Deepgram model, but the crazy high random spikes in processing time don't work for me.

ryanshrott Dec 30, 2025
Author

https://www.reddit.com/r/speechtech/comments/1ppjo40/comment/nunf73s/

ryanshrott · 2025-12-31T13:35:38Z

ryanshrott
Dec 31, 2025
Author

The real-time streaming API also has the same slowdown.

8 replies

ryanshrott Jan 2, 2026
Author

> Appreciate the request IDs - I've verified they all appear to have what looks like a 10 second delay with these batch requests.
>
> I'm going to follow up internally on this with our inference team for investigation, as there may be a bug somewhere within our API service that's delaying the response.

Thanks. I've talked to a few other people on Reddit experiencing the same thing. Nova-3 is really fast most of the time, but occasionally delayed by 10 seconds. Since the 10 second delay is not acceptable for my use case, I've moved my prod system to another provider. I'd move back if the issue gets resolved. Thanks.

srisch Jan 2, 2026
Maintainer

You're welcome! Totally get it and I do apologize for that, definitely isn't a desired experience.

We just verified internally it is indeed a bug in our API and the appropriate team is working on it. The team has marked it as a high priority fix, so I'm hoping for sometime early next week.

Will keep you posted when the build goes live.

ryanshrott Jan 2, 2026
Author

> You're welcome! Totally get it and I do apologize for that, definitely isn't a desired experience.
>
> We just verified internally it is indeed a bug in our API and the appropriate team is working on it. The team has marked it as a high priority fix, so I'm hoping for sometime early next week.
>
> Will keep you posted when the build goes live.

Awesome. Thanks for keeping me posted!

srisch Jan 8, 2026
Maintainer

You're welcome - we pushed a hotfix this morning so you should see an improvement. Let me know if the behavior shows up again!

PMekerle Jan 8, 2026

Thanks for keeping us up to date. We will be testing here. 🙏

PMekerle · 2026-01-02T18:37:25Z

PMekerle
Jan 2, 2026

Same problem here. Hope there's a solution soon. Using Nova-3 Medical

2 replies

PMekerle Jan 2, 2026

For example, the request with ID 676eba5a-967d-4eb8-a9f5-0fbd114b3322 (7s of audio took 10s to transcribe).

srisch Jan 2, 2026
Maintainer

> for example request with id 676eba5a-967d-4eb8-a9f5-0fbd114b3322 (7s of audio took 10s to transcribe)

I just checked this one, and my logs indicate that we responded in 600ms; that was the 200 response code returned by our load balancer. Do you possibly have a different ID I can check?
