[Bug]: Agents not passing max_tokens to LiteLLM endpoint #6884
Replies: 2 comments
- Thanks, I confirmed this is happening and will have a fix soon. It's due to a discrepancy in how the fields are interpreted in the SDK we are using.
- Closed by #6886
What happened?
I have an agent configured to use a model from my LiteLLM endpoint.
When I set max_tokens in the agent config, the property is not included in the request sent to LiteLLM.
If I use the OpenAI or Anthropic endpoints directly, max_tokens is sent correctly.
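For concreteness, this is a minimal sketch of the kind of OpenAI-compatible request body I would expect the agent to send to LiteLLM; the model name and values are illustrative placeholders, and only temperature and max_tokens matter here:

```yaml
# Illustrative chat-completions parameters (placeholders, not my actual request)
model: gpt-4o-mini      # placeholder model name
temperature: 0.7        # forwarded to LiteLLM correctly today
max_tokens: 512         # set in the agent config, but missing from the actual request
messages:
  - role: user
    content: "Hello"
```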
Version Information
ghcr.io/danny-avila/librechat-dev:latest@sha256:71f4135a646e0318a5f0b7079a4d253ba2673c64bac67caef883c4035e2f9888
Steps to Reproduce
Create a LiteLLM endpoint (an illustrative sketch of the endpoint definition follows these steps):
Create an agent that uses this endpoint.
Agent config from the DB, with a few identifying fields removed:
Use the agent and note that it does not cap the output at 512 tokens.
View the request details in LiteLLM and note that temperature is included correctly but max_tokens is not:
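The actual endpoint and agent configs are omitted above; the following is a minimal sketch of the shape they take, assuming a standard librechat.yaml custom endpoint pointing at a LiteLLM proxy. The name, baseURL, models, and key reference are placeholders rather than my real values, and the agent itself has max_tokens set to 512 in its model parameters:

```yaml
# librechat.yaml -- illustrative custom endpoint pointing at a LiteLLM proxy
# (all values below are placeholders, not the real configuration)
endpoints:
  custom:
    - name: "LiteLLM"
      apiKey: "${LITELLM_API_KEY}"
      baseURL: "http://litellm:4000/v1"
      models:
        default: ["gpt-4o-mini"]
        fetch: true
      titleConvo: true
      titleModel: "gpt-4o-mini"
```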
What browsers are you seeing the problem on?
No response
Relevant log output
Screenshots
Code of Conduct
I agree to follow this project's Code of Conduct