Bedrock Max Tokens With Large MCP Tool Responses #8489
There are truncation strategies per tool call, meaning earlier tool calls are truncated after subsequent tool calls as the limit is approached. However, if a single tool response suddenly exceeds the limit on its own, it will cause this error. In my opinion, the best thing to do is optimize the MCP server, or the query being made to its tool, so that it doesn't generate this many logs. For example, the MCP server could use resources and save this output to a file via the resources spec, which seems more appropriate, or it could limit the amount of output and give clear instructions on how to "fetch" the next page. We could introduce some lossiness by limiting the amount of output per tool call; ideally, this would be configurable per agent. At this point we cross over into "feature request" territory: there is currently no configuration you can set to limit the length of requests to Bedrock after a tool call via MCP.
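To make the two server-side options concrete, here is a minimal sketch of what paging and lossy truncation could look like inside a tool handler. This is not LibreChat or MCP SDK code: the function names, the `cursor` parameter, and the 4000-character default are all hypothetical, and character counts are only a rough stand-in for tokens.

```python
from typing import Optional, TypedDict


class Page(TypedDict):
    text: str                   # the slice of output returned for this call
    next_cursor: Optional[int]  # offset to pass on the next call, or None when done
    hint: str                   # plain-language instruction for the model


def paginate_output(full_text: str, cursor: int = 0, max_chars: int = 4000) -> Page:
    """Return one bounded slice of a large tool result (hypothetical helper)."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if cursor < 0:
        raise ValueError("cursor must be non-negative")

    chunk = full_text[cursor:cursor + max_chars]
    end = cursor + len(chunk)
    next_cursor = end if end < len(full_text) else None

    if next_cursor is None:
        hint = "End of output."
    else:
        hint = (
            f"Output cut at character {end} of {len(full_text)}. "
            f"Call this tool again with cursor={next_cursor} to fetch the next page."
        )
    return {"text": chunk, "next_cursor": next_cursor, "hint": hint}


def truncate_output(full_text: str, max_chars: int = 4000) -> str:
    """Lossy alternative: keep the head of the output and mark what was dropped."""
    if len(full_text) <= max_chars:
        return full_text
    omitted = len(full_text) - max_chars
    return full_text[:max_chars] + f"\n[... {omitted} characters omitted ...]"
```

A tool would return `page["text"]` together with `page["hint"]`, so the model knows more data exists and how to request it. The paged variant keeps everything reachable at the cost of extra tool calls, while the truncating variant is simpler but drops data.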
-
Hi guys, and thanks for such an awesome tool!
We have a LibreChat instance configured with an AWS Bedrock endpoint, and it works fine. However, when using tools via the several MCP servers we have configured, we frequently run into a problem. The error commonly looks like this, and the response stops after a tool execution that returns a lot of data:
This is how it looks in the UI:

This seems to happen because the tool response is so large that the next call to Bedrock fails. Given the specifics of the error, perhaps the large message is being truncated, but the original user message is lost as well. I can reproduce this every time.
When we first deployed LibreChat with these MCPs and ran into this issue, we had the model defined directly in modelSpecs like so (note that maxContextTokens and maxTokens are set lower than the limits for troubleshooting):
The Bedrock endpoint is defined as follows:
I dived into the code briefly to check the truncation logic, and there seemed to be a lot of references to Agents specifically, so I moved away from defining our model in modelSpecs and instead defined a default Agent that I created in the GUI as follows (note that the max tokens I've set is way under the maximum for this Bedrock model; in fact, changing this seemed to make no difference):
However, the issue persists. Is there any other configuration I can make to limit the length of requests to Bedrock after using a Tool call via MCP? Any help appreciated!