feat: add DEFAULT_MAX_TOKENS env variable for global token limit by AviadHayumi · Pull Request #2069 · huggingface/chat-ui

AviadHayumi · 2026-01-22T09:10:47Z

Summary

Add a new DEFAULT_MAX_TOKENS environment variable that sets a global default for max_tokens across all models, eliminating the need to configure max_tokens individually for each model in the MODELS env var.

Why

When using OpenAI-compatible backends like NVIDIA NIM, not setting max_tokens can cause issues:

- NIM may use very high default values (e.g., 131072) that exceed the model's context window
- This leads to errors like `Input length + max new tokens > max sequence length`
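
The arithmetic behind that error can be sketched as follows. This is only an illustration, assuming the backend rejects a request whenever the prompt length plus the requested new tokens exceeds the model's maximum sequence length, as the error message suggests. The helper name `exceedsSequenceLimit`, the 8192-token limit, and the 500-token prompt are hypothetical; only 131072 and 4096 come from this description.

```typescript
// Hypothetical helper: mirrors the check implied by the error message
// "Input length + max new tokens > max sequence length".
function exceedsSequenceLimit(
  inputTokens: number,
  maxNewTokens: number,
  maxSequenceLength: number
): boolean {
  return inputTokens + maxNewTokens > maxSequenceLength;
}

// Assumed values: a 500-token prompt and an 8192-token sequence limit.
// 131072 is the backend default quoted above; 4096 is the explicit limit.
const withBackendDefault = exceedsSequenceLimit(500, 131072, 8192);
const withExplicitLimit = exceedsSequenceLimit(500, 4096, 8192);
```

Under these assumed numbers, the request using the backend default is rejected, while the one with an explicit limit of 4096 fits.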

Before this change, users had to configure both OPENAI_BASE_URL and MODELS just to set max_tokens:

OPENAI_BASE_URL=http://nim-service.namespace.svc.cluster.local/v1
MODELS='[{"id":"meta/llama-3.1-8b-instruct","name":"Llama 3.1 8B (NIM)","parameters":{"max_tokens":4096}}]'

After this change, users can simply set:

OPENAI_BASE_URL=http://nim-service.namespace.svc.cluster.local/v1
DEFAULT_MAX_TOKENS=4096

This is much simpler: there is no need to duplicate the model ID and name, or to use the MODELS config at all, just to set a token limit.

Changes

| File | Change |
| --- | --- |
| `src/lib/server/config.ts` | Add `DEFAULT_MAX_TOKENS` to the `ExtraConfigKeys` type |
| `src/lib/server/endpoints/openai/endpointOai.ts` | Use `DEFAULT_MAX_TOKENS` as a fallback when the model's `max_tokens` parameter is not set |

Test plan

- `npm run check` passes
- Tested with NVIDIA NIM: requests now include reasonable `max_tokens` values
- Verified that per-model `parameters.max_tokens` still takes priority over `DEFAULT_MAX_TOKENS`
- Verified that omitting both leaves `max_tokens` undefined (existing behavior)
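
The precedence checked above can be sketched roughly as below. This is a minimal illustration, not the code in `endpointOai.ts`: the helper name `resolveMaxTokens` is hypothetical, and the parsing and validation of the env value (treating empty, non-numeric, or non-positive strings as unset) is an assumption that may differ from the actual change.

```typescript
// Hypothetical helper, not the code from this PR.
// modelMaxTokens stands in for the model's parameters.max_tokens;
// defaultMaxTokensEnv stands in for the raw DEFAULT_MAX_TOKENS string.
function resolveMaxTokens(
  modelMaxTokens: number | undefined,
  defaultMaxTokensEnv: string | undefined
): number | undefined {
  // A per-model value always takes priority over the global default.
  if (modelMaxTokens !== undefined) return modelMaxTokens;
  // Env values arrive as strings; an unset or empty value means "no default".
  if (!defaultMaxTokensEnv) return undefined;
  // Assumption: non-numeric or non-positive values are treated as unset.
  const parsed = Number.parseInt(defaultMaxTokensEnv, 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : undefined;
}

const perModelWins = resolveMaxTokens(2048, "4096");
const globalFallback = resolveMaxTokens(undefined, "4096");
const neitherSet = resolveMaxTokens(undefined, undefined);
```

Here `perModelWins` is the per-model value, `globalFallback` is the parsed global default, and `neitherSet` stays `undefined`, so the request omits `max_tokens` as before.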

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 24017e6f5a


src/lib/server/endpoints/openai/endpointOai.ts

gary149 · 2026-01-26T21:00:45Z

ok

chatgpt-codex-connector bot reviewed Jan 22, 2026

src/lib/server/endpoints/openai/endpointOai.ts (outdated, resolved)

feat: add DEFAULT_MAX_TOKENS env variable for global token limit

0a4586e

AviadHayumi force-pushed the feat/default-max-tokens branch from 24017e6 to 0a4586e on January 22, 2026 09:41

gary149 closed this Jan 26, 2026

gary149 reopened this Jan 26, 2026

AviadHayumi wants to merge 1 commit into huggingface:main from run-ai:feat/default-max-tokens
