Significant degradation after just using 20% of context? #5269
Replies: 5 comments 4 replies
-
Hello, any update on this yet? :)
-
Some more reports on this on Reddit: https://www.reddit.com/r/GeminiCLI/comments/1mewkku/any_tips_to_avoid_brain_farts/
-
@NTaylorMullen could you have a look?
-
I know I sound like a broken record, but this is exactly the nextSpeakerChecker problem. Instead of getting rid of it, they switched it to Flash Lite; now it is marginally faster but still goes insane. I removed it from our downstream fork, https://github.com/acoliver/llxprt-code, and performance improved, and so far it no longer loops on larger contexts. I can't stress enough what a bad solution it is to send the entire context to Flash multiple times per user interaction just to ask whether Pro should continue. Fixing this makes the tool infinitely more usable. The other thing that helps is not letting the context get so huge, e.g. tightening things like the tool calls...
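To put rough numbers on the overhead: every check re-sends the whole conversation to the checker model. Here's a back-of-the-envelope sketch (the numbers and the function are hypothetical, and it treats the context size as constant even though it actually grows over the session):

```typescript
// Rough token-overhead estimate for the next-speaker check.
// Purely illustrative; not taken from the gemini-cli codebase.
function extraCheckTokens(
  contextTokens: number, // current conversation size in tokens
  checksPerTurn: number, // how often the checker fires per user turn
  turns: number, // user interactions in the session
): number {
  // Each check re-sends the full context to the checker model.
  return contextTokens * checksPerTurn * turns;
}

// e.g. a 100k-token context checked twice per turn over 10 turns re-sends
// ~2,000,000 extra tokens on top of the actual work.
console.log(extraCheckTokens(100_000, 2, 10)); // 2000000
```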
-
Multiple times, especially if you are using Pro. See https://github.com/google-gemini/gemini-cli/blob/main/packages/core/src/utils/nextSpeakerChecker.ts. The flow: you send a chat, tool calls happen, the next speaker checker sends the whole conversation to Flash (now Flash Lite) and asks whether Pro should continue. Flash Lite tells Pro to continue. Pro responds or edits something. Then nextSpeakerChecker runs again... In LLxprt Code we removed it and fixed some async bugs the removal exposed. It works with Pro (paid and free) as well as other models. I could be mistaken, but I've yet to see Pro sit there wondering whether it should continue; it seems perfectly capable without this check. The token burn is certainly lower, but more importantly my patience is tried less.
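For anyone who hasn't read the source, the pattern is roughly the following. This is a minimal sketch, not the actual gemini-cli code; every name in it (checkNextSpeaker, callCheapModel, CHECK_PROMPT, chatTurn) is made up for illustration, so see the linked nextSpeakerChecker.ts for the real thing:

```typescript
type Role = 'user' | 'model';
interface Message { role: Role; text: string; }

const CHECK_PROMPT =
  'Given the conversation so far, who should speak next? ' +
  'Respond with JSON: {"next_speaker": "user" | "model"}';

// Ask a cheaper model whether the main model should keep going. Note that
// the *entire* history is serialized and re-sent on every single check.
async function checkNextSpeaker(
  history: Message[],
  callCheapModel: (prompt: string) => Promise<string>,
): Promise<'user' | 'model'> {
  const transcript = history.map((m) => `${m.role}: ${m.text}`).join('\n');
  const raw = await callCheapModel(`${transcript}\n\n${CHECK_PROMPT}`);
  try {
    const parsed = JSON.parse(raw) as { next_speaker?: string };
    return parsed.next_speaker === 'model' ? 'model' : 'user';
  } catch {
    return 'user'; // On a malformed reply, hand control back to the user.
  }
}

// The loop described above: after every model turn, the checker can tell
// the main model to keep going, which triggers another turn, which
// triggers another full-context check...
async function chatTurn(
  history: Message[],
  callMainModel: (history: Message[]) => Promise<Message>,
  callCheapModel: (prompt: string) => Promise<string>,
): Promise<void> {
  let next: 'user' | 'model' = 'model';
  while (next === 'model') {
    history.push(await callMainModel(history));
    next = await checkNextSpeaker(history, callCheapModel);
  }
}
```

The failure mode is visible right in that loop: as long as the cheap model keeps answering "model", Pro keeps going, and every iteration pays the full-context check again. Removing the check means the turn simply ends when the model stops.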
-
So, I've been using Gemini CLI for a couple of weeks now.
One thing I've noticed consistently is that performance degrades significantly after I've used around 15-20% of the context, i.e. once I see "80% context left" or less. I understand there's a term for this, context rot, but I wouldn't expect it to kick in this early in the chat. Because of this, I always have to quit and start a new chat at that point; I've never reached 50% of the available context.
To fellow users, have you also observed a similar pattern?
To the Googlers working on this: have you internally tested specifically for context rot once more than 20% of the context has been used?
Next time it happens, I can provide a more detailed example. But it's usually of the form that the model stops honoring some of the conditions in the context file, OR starts going in loops, OR starts taking longer on tasks, OR stops giving me good enough solutions. And it doesn't do any of that in a fresh chat: it's quite good from the start of a fresh chat up to the point where I've used about 10% of the context.