fix: disable reasoning during bootstrap to prevent empty responses by yasinBursali · Pull Request #791 · Light-Heart-Labs/DreamServer

yasinBursali · 2026-04-04T12:13:50Z

What

Disable llama-server reasoning/thinking mode during bootstrap so the Qwen3.5-2B bootstrap model produces visible responses.

Why

Qwen3.5-2B is a "thinking" model that allocates all token budget to internal reasoning_content before generating visible content. At default token limits (50-1000 tokens), the model produces completely empty visible responses — the user sees nothing during the entire bootstrap period. This breaks the first-run experience.

How

llama-server (build b8248) supports --reasoning off and the LLAMA_ARG_REASONING environment variable. The fix:

docker-compose.base.yml — Pass LLAMA_ARG_REASONING=${LLAMA_REASONING:-auto} to the llama-server container
installers/phases/11-services.sh — Set LLAMA_REASONING=off in .env when bootstrap is active
scripts/bootstrap-upgrade.sh — Remove LLAMA_REASONING=off from .env before restarting with the full model (restores auto default)
installers/macos/install-macos.sh — Set LLAMA_REASONING=off in .env and as shell variable during bootstrap; add --reasoning flag to native llama-server launch
.env.schema.json — Register LLAMA_REASONING

Lifecycle: off during bootstrap → removed on upgrade → auto default resumes.

Testing

shellcheck on all modified shell files — clean (no new warnings)
.env.schema.json JSON validation — valid
docker-compose.base.yml YAML validation — valid
No secrets, no unrelated changes

Manual test steps:

Fresh install with bootstrap → verify LLAMA_REASONING=off in .env
Send chat message during bootstrap → verify visible response (not empty)
After background upgrade completes → verify LLAMA_REASONING removed from .env
Non-bootstrap install → verify LLAMA_REASONING absent, auto default used

Review

Critique Guardian: APPROVED

Note: AMD/Lemonade-server likely ignores LLAMA_ARG_REASONING env var — harmless no-op

Platform Impact

Linux/WSL2 (Docker): LLAMA_ARG_REASONING env var passed through compose — works
macOS (native Metal): --reasoning CLI flag + shell variable + .env persistence — works
Windows/WSL2: Same Docker compose path as Linux — works

🤖 Generated with Claude Code

Qwen3.5-2B (bootstrap model) is a thinking model that allocates all tokens to reasoning_content before generating visible output. At default token limits, users see empty responses during bootstrap. Set LLAMA_REASONING=off in .env during bootstrap, passed to llama-server via LLAMA_ARG_REASONING env var (Docker) or --reasoning flag (macOS native). Removed by bootstrap-upgrade.sh when full model loads, restoring auto default. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Lightheartdevs

Audit: REQUEST CHANGES — macOS bug

The lifecycle is well-designed for Docker (Linux/WSL), but there's a bug on macOS:

Bug: macOS native llama-server is never restarted after upgrade.
bootstrap-upgrade.sh only handles Docker containers (docker compose stop/up). On macOS, llama-server runs as a native process. After the background model download completes:

.env gets cleaned up correctly (LLAMA_REASONING removed) ✓
But the running native process keeps --reasoning off AND the old bootstrap model indefinitely ✗
No code path exists in bootstrap-upgrade.sh to restart the native process

Users on Apple Silicon will be stuck with a degraded bootstrap model with reasoning disabled after upgrade, until manual restart.

Fix options:

Add native process restart logic to bootstrap-upgrade.sh (detect PID, kill, relaunch with new model and --reasoning auto)
At minimum, print a user-facing message: "Restart llama-server to complete the upgrade"

Other findings (non-blocking):

AMD/Lemonade backend ignores LLAMA_ARG_REASONING entirely — bootstrap fix has no effect on AMD installs. Worth documenting.
.env.example not updated with LLAMA_REASONING — minor inconsistency with other llama-server params documented there
mv vs cat > file && rm inconsistency in bootstrap-upgrade.sh — mv replaces the inode and could change file ownership/permissions. Existing code in the same file uses the cat pattern. Recommend matching it.
Commented-out lines (# LLAMA_REASONING=...) are correctly preserved by the awk patterns ✓
docker-compose.override.yml interaction is correct (user override wins) ✓
${LLAMA_REASONING:-auto} handles absent/empty/set cases correctly ✓

…n for .env Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

yasinBursali · 2026-04-05T19:35:09Z

Addressing review feedback

Bug (macOS native llama-server never restarted) — Fixed:

Added elif [[ -f "$INSTALL_DIR/data/.llama-server.pid" ]] block in bootstrap-upgrade.sh after the Docker restart section

Detects macOS native llama-server via PID file and prints restart notice:

Native llama-server detected (macOS Metal mode).
NOTICE: Restart llama-server to load the new model and re-enable reasoning.
Run: ./dream-macos.sh restart

Chose notice over automatic restart to avoid PID reuse risks and rollback complexity in this scope

Non-blocking: mv → cat pattern — Fixed:

Replaced mv "${ENV_FILE}.tmp" "$ENV_FILE" with cat "${ENV_FILE}.tmp" > "$ENV_FILE" && rm -f "${ENV_FILE}.tmp"
Preserves inode and file ownership/permissions
Matches the pattern used on lines 87, 92, 97, 101 of the same file

Lightheartdevs · 2026-04-06T13:01:26Z

Closing — superseded by #795 which merged with full macOS auto-restart + rollback logic. This PR only printed a restart notice; #795 does the actual hot-swap with old-model rollback on failure, PID verification, and stale status handling.

yasinBursali mentioned this pull request Apr 4, 2026

[WSL2] Bootstrap: Qwen3.5-9B chat returns empty content after model hot-swap — reasoning consumes entire token budget yasinBursali/DreamServer#250

Closed

Lightheartdevs requested changes Apr 4, 2026

View reviewed changes

fix: add macOS restart notice in bootstrap-upgrade and use cat patter…

56940e5

…n for .env Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

yasinBursali mentioned this pull request Apr 5, 2026

dream-macos.sh: start_native_llama() should honor LLAMA_REASONING and other .env vars yasinBursali/DreamServer#252

Closed

Lightheartdevs mentioned this pull request Apr 6, 2026

feat(bootstrap): add download progress tracking and macOS model hot-swap #795

Merged

9 tasks

Lightheartdevs closed this Apr 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: disable reasoning during bootstrap to prevent empty responses#791

fix: disable reasoning during bootstrap to prevent empty responses#791
yasinBursali wants to merge 2 commits intoLight-Heart-Labs:mainfrom
yasinBursali:fix/bootstrap-disable-reasoning

yasinBursali commented Apr 4, 2026

Uh oh!

Lightheartdevs left a comment

Uh oh!

yasinBursali commented Apr 5, 2026

Uh oh!

Lightheartdevs commented Apr 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

yasinBursali commented Apr 4, 2026

What

Why

How

Testing

Review

Platform Impact

Uh oh!

Lightheartdevs left a comment

Choose a reason for hiding this comment

Uh oh!

yasinBursali commented Apr 5, 2026

Addressing review feedback

Uh oh!

Lightheartdevs commented Apr 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants