Merged
27 commits
807be11
Remove obsolete models/config.yaml and related code
oobabooga Mar 25, 2026
d6f1485
UI: Update the enable_thinking info message
oobabooga Mar 25, 2026
368f373
Fix --idle-timeout issues with encode/decode and parallel generation
oobabooga Mar 25, 2026
e154140
Rename "truncation length" to "context length" in logs
oobabooga Mar 25, 2026
4cbea02
Add ik_llama.cpp support via `--ik` flag
oobabooga Mar 26, 2026
bda9517
Fix stopping string detection for chromadb/context-1
oobabooga Mar 28, 2026
9dd04b8
Suppress EOS token at logit level for ExLlamav3 when ban_eos_token is…
oobabooga Mar 28, 2026
4979e87
Add ik_llama.cpp support via ik_llama_cpp_binaries package
oobabooga Mar 28, 2026
be6fc06
Update the custom gradio wheels
oobabooga Mar 28, 2026
0466b6e
ik_llama.cpp: Auto-enable Hadamard KV cache rotation with quantized c…
oobabooga Mar 29, 2026
6382fbe
Several small code simplifications
oobabooga Mar 31, 2026
71c1a52
API: Implement echo + logprobs for /v1/completions endpoint
oobabooga Mar 31, 2026
328534b
Update llama.cpp
oobabooga Apr 1, 2026
4073164
Fix ExLlamav3 OOM on prompt logprobs and qwen3_5_moe HF compat
oobabooga Apr 2, 2026
a32ce25
Don't pass torch_dtype to transformers, autodetect from model config
oobabooga Apr 2, 2026
c10c6e8
API: Add token ids to logprobs output
oobabooga Apr 2, 2026
ea1f8c7
API: Optimize prompt logprobs and refactor ExLlamav3 forward pass
oobabooga Apr 2, 2026
c50e17b
Add dedicated ik portable requirements files and remove macOS ik builds
oobabooga Apr 2, 2026
8f8b57a
Update exllamav3
oobabooga Apr 2, 2026
6a1f720
Update transformers
oobabooga Apr 2, 2026
468cb5c
Update accelerate
oobabooga Apr 2, 2026
80e81a5
Remove ik macOS wheels from full requirements
oobabooga Apr 2, 2026
f6f8f14
Security: Fix SSRF in superbooga extensions
oobabooga Apr 2, 2026
091037e
Fix top_logprobs_ids missing for llama.cpp loader
oobabooga Apr 2, 2026
a61bde5
Update llama.cpp
oobabooga Apr 3, 2026
d841574
Update the custom gradio wheels
oobabooga Apr 3, 2026
7aab2fd
API: Improve cache clearing in logprobs
oobabooga Apr 3, 2026
28 changes: 28 additions & 0 deletions .github/workflows/build-everything-tgw.yml
Expand Up @@ -68,3 +68,31 @@ jobs:
    with:
      version: ${{ inputs.version }}
      config: 'os:macos-15-intel,macos-14'

  build_release_ik_cuda_windows:
    name: ik CUDA Windows
    uses: ./.github/workflows/build-portable-release-ik-cuda.yml
    with:
      version: ${{ inputs.version }}
      config: 'os:windows-2022'

  build_release_ik_cuda_linux:
    name: ik CUDA Linux
    uses: ./.github/workflows/build-portable-release-ik-cuda.yml
    with:
      version: ${{ inputs.version }}
      config: 'os:ubuntu-22.04'

  build_release_ik_cpu_windows:
    name: ik CPU Windows
    uses: ./.github/workflows/build-portable-release-ik.yml
    with:
      version: ${{ inputs.version }}
      config: 'os:windows-2022'

  build_release_ik_cpu_linux:
    name: ik CPU Linux
    uses: ./.github/workflows/build-portable-release-ik.yml
    with:
      version: ${{ inputs.version }}
      config: 'os:ubuntu-22.04'
178 changes: 178 additions & 0 deletions .github/workflows/build-portable-release-ik-cuda.yml
@@ -0,0 +1,178 @@
name: Build ik CUDA

on:
  workflow_dispatch:
    inputs:
      version:
        description: 'Version tag of text-generation-webui to build: v3.0'
        default: 'v3.0'
        required: true
        type: string
      config:
        description: 'Override configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2'
        default: 'Default'
        required: false
        type: string
      exclude:
        description: 'Exclude build configurations: key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2'
        default: 'None'
        required: false
        type: string
  workflow_call:
    inputs:
      version:
        description: 'Version tag of text-generation-webui to build: v3.0'
        default: 'v3.0'
        required: true
        type: string
      config:
        description: 'Configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2'
        default: 'Default'
        required: false
        type: string
      exclude:
        description: 'Exclude build configurations: key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2'
        default: 'None'
        required: false
        type: string

permissions:
  contents: write

jobs:
  define_matrix:
    name: Define Build Matrix
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    defaults:
      run:
        shell: pwsh
    env:
      CONFIGIN: ${{ inputs.config }}
      EXCLUDEIN: ${{ inputs.exclude }}

    steps:
      - name: Define Job Output
        id: set-matrix
        run: |
          $matrix = @{
              'os' = @('ubuntu-22.04', 'windows-2022')
              'pyver' = @("3.13")
              'cuda' = @("12.4", "13.1")
          }

          if ($env:CONFIGIN -ne 'Default') {$env:CONFIGIN.split(';').foreach({$matrix[$_.split(':')[0]] = $_.split(':')[1].split(',')})}

          if ($env:EXCLUDEIN -ne 'None') {
              $exclusions = @()
              $exclusions += $env:EXCLUDEIN.split(';').replace(':','=').replace(',',"`n") | ConvertFrom-StringData
              $matrix['exclude'] = $exclusions
          }

          $matrixOut = ConvertTo-Json $matrix -Compress
          Write-Output ('matrix=' + $matrixOut) >> $env:GITHUB_OUTPUT
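The override format accepted by the `config` input can be seen in isolation. This is an illustrative shell rendition (the workflow itself does this in PowerShell, as above): a string like `key1:a,b;key2:c` replaces each named matrix key with a new comma-separated value list.

```shell
# Hypothetical standalone demo of the "key:v1,v2;key:v3" override format.
CONFIGIN='os:windows-2022;cuda:13.1'
IFS=';' read -ra entries <<< "$CONFIGIN"
for entry in "${entries[@]}"; do
  key="${entry%%:*}"      # text before the first colon
  values="${entry#*:}"    # text after the first colon
  echo "matrix[$key] = [$values]"
done
# prints:
# matrix[os] = [windows-2022]
# matrix[cuda] = [13.1]
```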

  build_wheels:
    name: ${{ matrix.os }} ${{ matrix.pyver }} CUDA ${{ matrix.cuda }}
    needs: define_matrix
    runs-on: ${{ matrix.os }}
    strategy:
      matrix: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
    defaults:
      run:
        shell: pwsh
    env:
      PCKGVER: ${{ inputs.version }}

    steps:
      - uses: actions/checkout@v6
        with:
          repository: 'oobabooga/text-generation-webui'
          ref: ${{ inputs.version }}
          submodules: 'recursive'

      - uses: actions/setup-python@v6
        with:
          python-version: ${{ matrix.pyver }}

      - name: Build Package
        shell: bash
        run: |
          VERSION_CLEAN="${{ inputs.version }}"
          VERSION_CLEAN="${VERSION_CLEAN#v}"
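The `${VERSION_CLEAN#v}` expansion above is worth a standalone look: Bash's `${var#pattern}` removes the shortest leading match, so the archive name carries a bare version number rather than the `v`-prefixed git tag.

```shell
# Strip a single leading "v" from a release tag with parameter expansion.
TAG="v3.0"
VERSION_CLEAN="${TAG#v}"
echo "$VERSION_CLEAN"   # prints: 3.0
```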
          cd ..
          cp -r text-generation-webui "text-generation-webui-${VERSION_CLEAN}"
          cd "text-generation-webui-${VERSION_CLEAN}"

          # Remove extensions that need additional requirements
          allowed=("character_bias" "gallery" "sd_api_pictures")
          find extensions/ -mindepth 1 -maxdepth 1 -type d | grep -v -E "$(printf '%s|' "${allowed[@]}" | sed 's/|$//')" | xargs rm -rf
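The `printf | sed` pipeline inside that `find` command builds an extended-regex alternation from the allow-list; this snippet shows the intermediate pattern it produces.

```shell
# printf repeats '%s|' per array element, then sed trims the trailing pipe,
# yielding the alternation that grep -v -E uses to keep allowed extensions.
allowed=("character_bias" "gallery" "sd_api_pictures")
pattern="$(printf '%s|' "${allowed[@]}" | sed 's/|$//')"
echo "$pattern"   # prints: character_bias|gallery|sd_api_pictures
```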

          # Define common variables
          CUDA_VERSION="${{ matrix.cuda }}"
          VERSION="${{ inputs.version }}"

          # 1. Set platform-specific variables
          if [[ "$RUNNER_OS" == "Windows" ]]; then
            PLATFORM="windows"
            PYTHON_URL="https://github.com/astral-sh/python-build-standalone/releases/download/20260303/cpython-3.13.12+20260303-x86_64-pc-windows-msvc-install_only_stripped.tar.gz"
            PIP_PATH="portable_env/python.exe -m pip"
            PACKAGES_PATH="portable_env/Lib/site-packages"
            rm start_linux.sh start_macos.sh
          else
            PLATFORM="linux"
            PYTHON_URL="https://github.com/astral-sh/python-build-standalone/releases/download/20260303/cpython-3.13.12+20260303-x86_64-unknown-linux-gnu-install_only_stripped.tar.gz"
            PIP_PATH="portable_env/bin/python -m pip"
            PACKAGES_PATH="portable_env/lib/python3.13/site-packages"
            rm start_macos.sh start_windows.bat
          fi

          # 2. Download and extract Python
          cd ..
          echo "Downloading Python for $PLATFORM..."
          curl -L -o python-build.tar.gz "$PYTHON_URL"
          tar -xzf python-build.tar.gz
          mv python "text-generation-webui-${VERSION_CLEAN}/portable_env"

          # 3. Prepare requirements file based on CUDA version
          cd "text-generation-webui-${VERSION_CLEAN}"
          if [[ "$CUDA_VERSION" == "13.1" ]]; then
            REQ_FILE="requirements/portable/requirements_ik_cuda131.txt"
          else
            REQ_FILE="requirements/portable/requirements_ik.txt"
          fi

          # 4. Inject --ik into start scripts
          sed -i 's/--portable/--portable --ik/g' start_linux.sh start_windows.bat 2>/dev/null || true
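The effect of that `sed` substitution on a sample launcher invocation (the command line here is a made-up example, not an actual line from the start scripts): every `--portable` occurrence gains a trailing `--ik` flag, so the portable build starts with the ik_llama.cpp loader enabled.

```shell
# Demonstrate the step-4 substitution on one illustrative line.
echo 'python server.py --portable --api' | sed 's/--portable/--portable --ik/g'
# prints: python server.py --portable --ik --api
```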

          # 5. Install packages
          echo "Installing Python packages from $REQ_FILE..."
          $PIP_PATH install --target="./$PACKAGES_PATH" -r "$REQ_FILE"

          # 6. Clean up
          rm -rf .git cmd* update_wizard* Colab-TextGen-GPU.ipynb docker setup.cfg .github .gitignore requirements/ one_click.py

          # 7. Create archive
          cd ..
          if [[ "$RUNNER_OS" == "Windows" ]]; then
            ARCHIVE_NAME="textgen-portable-ik-${VERSION_CLEAN}-${PLATFORM}-cuda${CUDA_VERSION}.zip"
            echo "Creating archive: $ARCHIVE_NAME"
            powershell -Command "Compress-Archive -Path text-generation-webui-${VERSION_CLEAN} -DestinationPath $ARCHIVE_NAME"
          else
            ARCHIVE_NAME="textgen-portable-ik-${VERSION_CLEAN}-${PLATFORM}-cuda${CUDA_VERSION}.tar.gz"
            echo "Creating archive: $ARCHIVE_NAME"
            tar czf "$ARCHIVE_NAME" "text-generation-webui-${VERSION_CLEAN}"
          fi

      - name: Upload files to a GitHub release
        id: upload-release
        uses: svenstaro/upload-release-action@2.7.0
        continue-on-error: true
        with:
          repo_token: ${{ secrets.GITHUB_TOKEN }}
          file: ../textgen-portable-ik-*
          tag: ${{ inputs.version }}
          file_glob: true
          make_latest: false
          overwrite: true
173 changes: 173 additions & 0 deletions .github/workflows/build-portable-release-ik.yml
@@ -0,0 +1,173 @@
name: Build ik CPU

on:
  workflow_dispatch:
    inputs:
      version:
        description: 'Version tag of text-generation-webui to build: v3.0'
        default: 'v3.0'
        required: true
        type: string
      config:
        description: 'Override configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2'
        default: 'Default'
        required: false
        type: string
      exclude:
        description: 'Exclude build configurations: key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2'
        default: 'None'
        required: false
        type: string
  workflow_call:
    inputs:
      version:
        description: 'Version tag of text-generation-webui to build: v3.0'
        default: 'v3.0'
        required: true
        type: string
      config:
        description: 'Configurations to build: key1:item1-1,item1-2;key2:item2-1,item2-2'
        default: 'Default'
        required: false
        type: string
      exclude:
        description: 'Exclude build configurations: key1-1:item1-1,key1-2:item1-2;key2-1:item2-1,key2-2:item2-2'
        default: 'None'
        required: false
        type: string

permissions:
  contents: write

jobs:
  define_matrix:
    name: Define Build Matrix
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    defaults:
      run:
        shell: pwsh
    env:
      CONFIGIN: ${{ inputs.config }}
      EXCLUDEIN: ${{ inputs.exclude }}

    steps:
      - name: Define Job Output
        id: set-matrix
        run: |
          $matrix = @{
              'os' = @('ubuntu-22.04', 'windows-2022')
              'pyver' = @("3.13")
          }

          if ($env:CONFIGIN -ne 'Default') {$env:CONFIGIN.split(';').foreach({$matrix[$_.split(':')[0]] = $_.split(':')[1].split(',')})}

          if ($env:EXCLUDEIN -ne 'None') {
              $exclusions = @()
              $exclusions += $env:EXCLUDEIN.split(';').replace(':','=').replace(',',"`n") | ConvertFrom-StringData
              $matrix['exclude'] = $exclusions
          }

          $matrixOut = ConvertTo-Json $matrix -Compress
          Write-Output ('matrix=' + $matrixOut) >> $env:GITHUB_OUTPUT

  build_wheels:
    name: ${{ matrix.os }} ${{ matrix.pyver }}
    needs: define_matrix
    runs-on: ${{ matrix.os }}
    strategy:
      matrix: ${{ fromJSON(needs.define_matrix.outputs.matrix) }}
    defaults:
      run:
        shell: pwsh
    env:
      PCKGVER: ${{ inputs.version }}

    steps:
      - uses: actions/checkout@v6
        with:
          repository: 'oobabooga/text-generation-webui'
          ref: ${{ inputs.version }}
          submodules: 'recursive'

      - uses: actions/setup-python@v6
        with:
          python-version: ${{ matrix.pyver }}

      - name: Build Package
        shell: bash
        run: |
          VERSION_CLEAN="${{ inputs.version }}"
          VERSION_CLEAN="${VERSION_CLEAN#v}"
          cd ..
          cp -r text-generation-webui "text-generation-webui-${VERSION_CLEAN}"
          cd "text-generation-webui-${VERSION_CLEAN}"

          # Remove extensions that need additional requirements
          allowed=("character_bias" "gallery" "sd_api_pictures")
          find extensions/ -mindepth 1 -maxdepth 1 -type d | grep -v -E "$(printf '%s|' "${allowed[@]}" | sed 's/|$//')" | xargs rm -rf

          # Define common variables
          VERSION="${{ inputs.version }}"

          # 1. Set platform-specific variables
          if [[ "$RUNNER_OS" == "Windows" ]]; then
            PLATFORM="windows-cpu"
            PYTHON_URL="https://github.com/astral-sh/python-build-standalone/releases/download/20260303/cpython-3.13.12+20260303-x86_64-pc-windows-msvc-install_only_stripped.tar.gz"
            PIP_PATH="portable_env/python.exe -m pip"
            PACKAGES_PATH="portable_env/Lib/site-packages"
            rm start_linux.sh start_macos.sh
          else
            PLATFORM="linux-cpu"
            PYTHON_URL="https://github.com/astral-sh/python-build-standalone/releases/download/20260303/cpython-3.13.12+20260303-x86_64-unknown-linux-gnu-install_only_stripped.tar.gz"
            PIP_PATH="portable_env/bin/python -m pip"
            PACKAGES_PATH="portable_env/lib/python3.13/site-packages"
            rm start_macos.sh start_windows.bat
          fi

          # 2. Download and extract Python
          echo "Downloading Python for $PLATFORM..."
          cd ..
          curl -L -o python-build.tar.gz "$PYTHON_URL"
          tar -xzf python-build.tar.gz
          mv python "text-generation-webui-${VERSION_CLEAN}/portable_env"

          # 3. Prepare requirements file
          cd "text-generation-webui-${VERSION_CLEAN}"
          REQ_FILE="requirements/portable/requirements_ik_cpu_only.txt"
          echo "Using requirements file: $REQ_FILE"

          # 4. Inject --ik into start scripts
          sed -i 's/--portable/--portable --ik/g' start_linux.sh start_windows.bat 2>/dev/null || true

          # 5. Install packages
          echo "Installing Python packages from $REQ_FILE..."
          $PIP_PATH install --target="./$PACKAGES_PATH" -r "$REQ_FILE"

          # 6. Clean up
          rm -rf .git cmd* update_wizard* Colab-TextGen-GPU.ipynb docker setup.cfg .github .gitignore requirements/ one_click.py

          # 7. Create archive
          cd ..
          if [[ "$RUNNER_OS" == "Windows" ]]; then
            ARCHIVE_NAME="textgen-portable-ik-${VERSION_CLEAN}-${PLATFORM}.zip"
            echo "Creating archive: $ARCHIVE_NAME"
            powershell -Command "Compress-Archive -Path text-generation-webui-${VERSION_CLEAN} -DestinationPath $ARCHIVE_NAME"
          else
            ARCHIVE_NAME="textgen-portable-ik-${VERSION_CLEAN}-${PLATFORM}.tar.gz"
            echo "Creating archive: $ARCHIVE_NAME"
            tar czf "$ARCHIVE_NAME" "text-generation-webui-${VERSION_CLEAN}"
          fi

      - name: Upload files to a GitHub release
        id: upload-release
        uses: svenstaro/upload-release-action@2.7.0
        continue-on-error: true
        with:
          repo_token: ${{ secrets.GITHUB_TOKEN }}
          file: ../textgen-portable-ik-*
          tag: ${{ inputs.version }}
          file_glob: true
          make_latest: false
          overwrite: true
2 changes: 1 addition & 1 deletion docs/01 - Chat Tab.md
Expand Up @@ -112,7 +112,7 @@ Used for talking to an instruction-following model using the prompt format defin

The prompt format is defined by the **Instruction template** parameter in "Parameters" > "Instruction template", which represents a Jinja2 template.

Note that when you load a model in the "Model" tab, the web UI will try to automatically detect its instruction template (if any), and will update the values under "Parameters" > "Instruction template" accordingly. This is done using a set of regular expressions defined in `user_data/models/config.yaml`. This detection is not guaranteed to be accurate. You should check the model card on Hugging Face to see if you are using the correct prompt format.
Note that when you load a model in the "Model" tab, the web UI will try to automatically detect its instruction template (if any) from the model metadata (e.g. `tokenizer_config.json` or GGUF metadata), and will update the values under "Parameters" > "Instruction template" accordingly. You should check the model card on Hugging Face to see if you are using the correct prompt format.
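As a rough sketch of what metadata-based detection involves (not the web UI's actual code): transformers-format models embed the Jinja2 chat template in `tokenizer_config.json` under the `chat_template` key, so it can be read back with any JSON tool. The file written here is a made-up minimal example.

```shell
# Create a toy tokenizer_config.json, then extract its chat template.
cat > tokenizer_config.json <<'EOF'
{"chat_template": "{% for m in messages %}{{ m.content }}{% endfor %}"}
EOF
python3 -c 'import json; print(json.load(open("tokenizer_config.json")).get("chat_template"))'
```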

### Chat-instruct
