model routing appears to be very good at protecting 2.5-pro quota. What's the catch? #12056
timrichardson
started this conversation in
General
A few hours ago I enabled model routing in my stable release, and it seems to be an exceptionally good feature. In my current session it shows 69 requests to 2.5-pro, 21 to 2.5-flash, and 563 to 2.5-flash-lite. It is definitely protecting my quota of pro queries, and I haven't noticed any decline in usefulness on a fairly large Python codebase.
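Just to illustrate the behavior described above: a router like this presumably sends each request to the cheapest model that can handle it and reserves pro for hard cases while its quota lasts. This is only a hypothetical sketch; the names (route_request, QUOTAS, the complexity heuristic) are my assumptions, not the CLI's actual internals.

```python
# Hypothetical sketch of quota-aware model routing. All names and the
# complexity heuristic are illustrative assumptions; the real routing
# logic is not shown in this discussion.

QUOTAS = {"2.5-pro": 200}  # the post suggests roughly 200 pro queries

usage = {"2.5-pro": 0, "2.5-flash": 0, "2.5-flash-lite": 0}

def classify_complexity(prompt: str) -> str:
    """Crude stand-in for whatever heuristic decides request difficulty."""
    if len(prompt) > 2000 or "refactor" in prompt.lower():
        return "hard"
    if len(prompt) > 500:
        return "medium"
    return "easy"

def route_request(prompt: str) -> str:
    """Pick the cheapest plausible model, spending pro quota only on
    hard requests while quota remains."""
    tier = classify_complexity(prompt)
    if tier == "hard" and usage["2.5-pro"] < QUOTAS["2.5-pro"]:
        model = "2.5-pro"
    elif tier in ("hard", "medium"):
        model = "2.5-flash"
    else:
        model = "2.5-flash-lite"
    usage[model] += 1
    return model
```

Under a scheme like this, short routine requests never touch the pro quota at all, which would explain the flash-lite count dominating the session stats.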
/stats model reports 347 errors for the 563 flash-lite requests, but I have seen no evidence of those errors in practice.
After a few hours, this looks like a transformative feature: without it, by now I would have been well past my AI Pro quota, which I think is 200 2.5-pro queries.
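To make the quota point concrete, here is the arithmetic on the session counts reported above (a quick check, not a CLI feature):

```python
# Per-model request counts from this session, as reported above.
requests = {"2.5-pro": 69, "2.5-flash": 21, "2.5-flash-lite": 563}

total = sum(requests.values())
print(total)                 # 653 requests in total this session
print(requests["2.5-pro"])   # only 69 of them count against the ~200 pro quota
```

So if every one of those 653 requests had gone to 2.5-pro, the quota would have been exhausted several times over.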
I am surprised this feature is not on by default. Either that, or I have failed to detect some downside of it.