Agent raises: "The maximum tokens you requested exceeds the model limit of", not from the model directly #6497
What happened?

Hello, I have an error when I invoke a model from the Agent tab, but not from Bedrock directly. Let's say I use a simple user prompt ("Hello, how are you?", as shown in the log output below). Here is the error log:
{
"$fault": "client",
"$metadata": {
"attempts": 1,
"httpStatusCode": 400,
"requestId": "2a1a7b8e-a48a-4cdb-8acf-4b7e51961306",
"totalRetryDelay": 0
},
"level": "error",
"message": "[handleAbortError] AI response error; aborting request: The maximum tokens you requested exceeds the model limit of 5120. Try again with a maximum tokens value that is lower than 5120.",
"name": "ValidationException",
"pregelTaskId": "4759adcd-5ff9-56e5-a8c6-3bfe566f9c2b",
"stack": "ValidationException: The maximum tokens you requested exceeds the model limit of 5120. Try again with a maximum tokens value that is lower than 5120.\n at de_ValidationExceptionRes (/app/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1690:21)\n at de_CommandError (/app/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1507:19)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async /app/node_modules/@aws-sdk/client-bedrock-runtime/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20\n at async /app/node_modules/@aws-sdk/client-bedrock-runtime/node_modules/@smithy/core/dist-cjs/index.js:167:18\n at async /app/node_modules/@aws-sdk/client-bedrock-runtime/node_modules/@smithy/middleware-retry/dist-cjs/index.js:321:38\n at async /app/node_modules/@aws-sdk/client-bedrock-runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:33:22\n at async ChatBedrockConverse._streamResponseChunks (/app/node_modules/@langchain/aws/dist/chat_models.cjs:695:26)\n at async ChatBedrockConverse._generateUncached (/app/node_modules/@langchain/core/dist/language_models/chat_models.cjs:188:34)\n at async ChatBedrockConverse.invoke (/app/node_modules/@langchain/core/dist/language_models/chat_models.cjs:65:24)"
}

I've searched Issues and Discussions but I haven't found anyone raising a similar error. I should point out that the agent only works with Claude Sonnet (3.5 and 3.7); even Haiku 3.5 fails.

Version Information

ghcr.io/danny-avila/librechat   v0.7.7   92d57359fc25   2 weeks ago   888MB

Steps to Reproduce
What browsers are you seeing the problem on?

No response

Relevant log output

2025-03-23T16:27:01.225Z debug: [BaseClient] Loading history:
{
conversationId: "922a7343-124b-4fb6-a760-851bb47c3208",
parentMessageId: "00000000-0000-0000-0000-000000000000",
}
2025-03-23T16:27:01.350Z debug: [BaseClient] Context Count (1/2)
{
remainingContextTokens: 294987,
maxContextTokens: 295000,
}
2025-03-23T16:27:01.351Z debug: [BaseClient] Context Count (2/2)
{
remainingContextTokens: 294987,
maxContextTokens: 295000,
}
2025-03-23T16:27:01.351Z debug: [BaseClient] tokenCountMap:
{
502cf490-f41d-4808-a7a1-8801238c4e46: 10,
}
2025-03-23T16:27:01.351Z debug: [BaseClient]
{
promptTokens: 13,
remainingContextTokens: 294987,
payloadSize: 1,
maxContextTokens: 295000,
}
2025-03-23T16:27:01.351Z debug: [BaseClient] tokenCountMap
{
502cf490-f41d-4808-a7a1-8801238c4e46: 10,
}
2025-03-23T16:27:01.351Z debug: [BaseClient] userMessage
{
messageId: "502cf490-f41d-4808-a7a1-8801238c4e46",
parentMessageId: "00000000-0000-0000-0000-000000000000",
conversationId: "922a7343-124b-4fb6-a760-851bb47c3208",
sender: "User",
text: "Hello, how are you?",
isCreatedByUser: true,
tokenCount: 10,
}
2025-03-23T16:27:01.363Z debug: [saveConvo] api/app/clients/BaseClient.js - saveMessageToDatabase #saveConvo
2025-03-23T16:27:01.714Z error: [api/server/controllers/agents/client.js #sendCompletion] Operation aborted The maximum tokens you requested exceeds the model limit of 5120. Try agai... [truncated]
2025-03-23T16:27:01.714Z error: [api/server/controllers/agents/client.js #sendCompletion] Unhandled error type The maximum tokens you requested exceeds the model limit of 5120. Try a... [truncated]
2025-03-23T16:27:01.714Z error: [handleAbortError] AI response error; aborting request: The maximum tokens you requested exceeds the model limit of 5120. Try again with a maximum tok... [truncated]
2025-03-23T16:27:01.720Z debug: [AgentController] Request closed
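For context: the validation appears to be triggered by the maxTokens inference parameter sent to Bedrock, not by the prompt size (the prompt here is only 13 tokens). Below is a minimal sketch, outside LibreChat, of how the same ValidationException can be reproduced with @langchain/aws (the library shown in the stack trace) when the requested maxTokens exceeds the model's output-token limit. The model ID, region, and the 8192 value are illustrative assumptions, not values taken from this report.

```ts
import { ChatBedrockConverse } from "@langchain/aws";

// Sketch: a maxTokens value above the model's output-token limit makes Bedrock
// reject the request with a ValidationException, regardless of prompt length.
const model = new ChatBedrockConverse({
  model: "anthropic.claude-3-5-haiku-20241022-v1:0", // illustrative model ID
  region: "us-east-1",
  maxTokens: 8192, // assumed to be above the model's limit -> ValidationException
});

// Even a tiny prompt fails, because the error concerns the requested maxTokens,
// not the context/prompt size.
await model.invoke("Hello, how are you?");
```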
Replies: 1 comment 3 replies
I'm not having this issue:
Try clicking this option and saving to ensure model parameters are not set:
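For what it's worth, the limit can also be checked outside LibreChat by calling Bedrock's Converse API directly with an explicit maxTokens below the reported limit; if that call succeeds, a saved max-tokens parameter on the agent is the likely culprit. A minimal sketch follows; the model ID, region, and the 4096 value are assumptions for illustration, and credentials are expected from the default AWS provider chain.

```ts
import {
  BedrockRuntimeClient,
  ConverseCommand,
} from "@aws-sdk/client-bedrock-runtime";

// Sketch: send the same prompt straight to the Converse API, keeping maxTokens
// below the limit reported in the error (5120).
const client = new BedrockRuntimeClient({ region: "us-east-1" });

const response = await client.send(
  new ConverseCommand({
    modelId: "anthropic.claude-3-5-haiku-20241022-v1:0", // illustrative model ID
    messages: [{ role: "user", content: [{ text: "Hello, how are you?" }] }],
    inferenceConfig: { maxTokens: 4096 }, // stays under the reported 5120 limit
  })
);

console.log(response.output?.message?.content?.[0]?.text);
```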