Control the maximum response size in IChatCompletionService
#12670
Aha, I see it's in the `PromptExecutionSettings`.
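For anyone landing here later: the output cap is set per call via the execution settings, not globally on the service. A minimal sketch, assuming the `Microsoft.SemanticKernel.Connectors.Google` connector (which provides `GeminiPromptExecutionSettings`) and an already-built `kernel`:

```csharp
// Sketch only: assumes the Microsoft.SemanticKernel and
// Microsoft.SemanticKernel.Connectors.Google packages, and a configured Kernel.
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.Google;

var settings = new GeminiPromptExecutionSettings
{
    // Raise the output cap; without this the connector's default applies.
    MaxTokens = 65536,
};

var chat = kernel.GetRequiredService<IChatCompletionService>();

var history = new ChatHistory();
history.AddUserMessage("Return the result as a JSON object.");

// The settings must be passed with each call; they are not remembered.
var reply = await chat.GetChatMessageContentAsync(history, settings, kernel);
```

If the cap is still too low for a given response, the response metadata will report `FinishReason = MAX_TOKENS`, which matches the truncation described in the question's EDIT below.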
I'm using `gemini-2.5-flash-lite-preview-06-17`. According to the docs for that model, the output token limit is 65,536 (so roughly 250k characters).

I'm using this model via the `IChatCompletionService`, and I'm asking the model to provide its response as a JSON object. However, the `GeminiChatMessageContent` result that I'm getting back from the chat completion service appears to be getting truncated somehow. It's never truncated to an exact length: it can be anywhere from ~1010 to ~1090 characters long, but never longer than that. I can't find anything in the docs about how to limit or configure the maximum response content size, and ~1,000 characters simply isn't very useful.
Any idea how or why the chat responses are getting truncated... and more importantly, how I can override/change this default behaviour to something more appropriate?
EDIT
After adding some logging, I see in the `GeminiChatMessageContent.Metadata` for the response that `FinishReason = MAX_TOKENS`. The chat history, at this point, consists of two messages: