All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Bumped `tiny-agent-os` from 1.2.9 to 1.2.11.
- Added the v0.1.94 release run artifact to capture the full publication workflow.
- Skipped the live alchemy integration test when the binding is unavailable.
- Applied small `isinstance(..., X | Y)` cleanups required by the current Ruff rules.
- Enforced strict config source validation and updated quality harness documentation.
- Added git safety practices documentation and updated agent guidance.
- Removed stale worktree metadata and unused state machine infrastructure.
- Cleaned up defensive logic and tightened TUI cold-start paths.
- Fixed remaining typing issues and tool argument validation.
- Corrected tinyagent imports and split agent configuration for better modularity.
- Fixed CodeRabbit agent feedback issues.
- Stabilized headless tinyagent serialization.
- Reduced TUI cold-start overhead by tightening command-registry and package import loading paths.
- Cleaned up stale project artifacts by removing outdated `.claude/` and `.tickets/` repository files.
- Added clipboard copy support for selected UI text and extracted agent-text streaming into a dedicated UI module.
- Relaxed clipboard copy verification so successful copies no longer fail when verification is unavailable.
- Stabilized tmux system tool coverage by making discover stats and skill-load confirmations explicit and by deflaking tmux wait logic around persistent evidence.
- Hardcoded `AGENTS.md` as the guide file path and removed guide-file configurability.
- Cleaned up obsolete docs and removed the unused `memory-bank/` directory.
- Retried `/update` installs against the active Python interpreter when `uv tool upgrade` cannot locate the current install.
- Added regression coverage for the `uv tool` update fallback flow.
- Smoothed the model picker flow and hardened model selection state handling in the TUI.
- Stabilized tmux system startup by waiting for the editor prompt before sending end-to-end tool commands.
- Added explicit coverage for the loaded-skills panel title so the skills UI contract is tested outside tmux.
- Removed the bottom status bar from the TUI and flattened the editor border to match the panel design.
- Removed tmux system test dependence on the status bar so the end-to-end tool coverage stays stable after the UI cleanup.
- Added tmux system coverage that loads a local skill and verifies the agent uses its referenced file end-to-end.
- Consolidated skill lookup, loading, and selected-skill summary resolution onto a single registry-backed path.
- Preserved missing selected skills in UI summary surfaces while keeping prompt-building fail-loud for unresolved skill loads.
- Kept skill catalog and summary views limited to skill names and descriptions while direct selected-skill loads still inject full `SKILL.md` content plus absolute file paths.
- Increased the default shell tool timeout to two minutes to reduce premature command termination in normal workflows.
- Removed a flaky headless CLI system test from the release train.
- Stabilized tmux discover-tool system coverage by waiting for rendered scan stats before asserting output.
- Introduced prompt context injection for actively loaded skills so selected skill guidance is surfaced to the agent runtime.
- Refined skills UI styling and load behavior to improve local/global skill discovery ergonomics.
- Resolved selected-skill path handling to use discovered absolute paths reliably across local and global skill locations.
- Included absolute skill paths in prompt context to prevent path-resolution drift when rendering skill metadata.
- Made `/skills` autocomplete prefer the most relevant skill match instead of fuzzy-reordering short prefixes.
- Isolated slash-command autocomplete to command names so `/skills` navigation and Enter selection no longer get hijacked by hidden command suggestions.
- Improved skills loading UX and restored slash-command submit behavior.
- Introduced a first-class skills subsystem with discovery, loading, session persistence, prompt rendering, and cache-backed registry support.
- Added `/skills` catalog, search, clear, and loaded-skill management commands in the TUI.
- Exposed selected skills in the session inspector and included available/selected skill context in agent prompt construction.
- Restored compatibility with legacy markdown-only `SKILL.md` files that do not start with YAML frontmatter.
- Fixed local-over-global precedence for case-insensitive skill name collisions.
- Prevented invalid discovered skill summaries from crashing agent creation for unrelated sessions.
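
The local-over-global precedence fix above can be pictured with a small sketch. This is illustrative only: `SkillSummary` and `merge_skills` are hypothetical names, not TunaCode's actual registry API, and the sketch assumes collisions are detected on case-folded names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillSummary:
    """Hypothetical stand-in for a discovered skill; not TunaCode's real type."""

    name: str
    source: str  # "local" or "global"


def merge_skills(
    local: list[SkillSummary], global_: list[SkillSummary]
) -> dict[str, SkillSummary]:
    """Merge skills keyed by case-folded name, letting local entries win."""
    merged: dict[str, SkillSummary] = {}
    # Insert global skills first so local ones overwrite them on collision.
    for skill in global_:
        merged[skill.name.casefold()] = skill
    for skill in local:
        merged[skill.name.casefold()] = skill
    return merged


skills = merge_skills(
    local=[SkillSummary("Deploy", "local")],
    global_=[SkillSummary("deploy", "global"), SkillSummary("lint", "global")],
)
```

Here `Deploy` (local) and `deploy` (global) collide on the same key, and the local entry is the one that survives.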
- Bumped the packaged TunaCode version to `0.1.82` to cut a new PyPI release.
- Migrated the live alchemy usage-contract integration test to the typed chutes flow for stricter runtime contract coverage.
- Refined tinyagent typing and compaction boundaries to tighten type-safety and reduce ambiguity in session compaction behavior.
- Removed residual `__all__` shim exports to satisfy Gate 0 and enforce direct exports only.
- Refreshed README screenshots to match the current UI.
- Declared `pydantic` as a direct project dependency to satisfy dependency analysis and prevent missing-direct-dependency failures.
- Aligned the `bash` tool timeout contract to seconds across prompt docs and runtime validation to prevent repeated millisecond-based tool failures.
- Hardened `bash` timeout bounds handling by rejecting `timeout=0` and added regression tests for timeout validation and prompt contract drift.
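
As a rough illustration of the seconds-based timeout contract described above, a validator might look like the sketch below. The function name and the upper bound are assumptions made for the example; the changelog only states that the unit is seconds and that `timeout=0` is rejected.

```python
MAX_TIMEOUT_SECONDS = 600  # hypothetical upper bound, chosen for illustration


def validate_timeout(timeout: int) -> int:
    """Return the timeout in seconds, or raise if it is out of bounds."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(timeout, bool) or not isinstance(timeout, int):
        raise TypeError("timeout must be an integer number of seconds")
    if timeout <= 0:
        raise ValueError("timeout must be at least 1 second")
    if timeout > MAX_TIMEOUT_SECONDS:
        raise ValueError(f"timeout must not exceed {MAX_TIMEOUT_SECONDS} seconds")
    return timeout
```

With an upper bound in place, a millisecond-style value such as `120000` fails validation loudly instead of being silently interpreted as seconds.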
- Added a full prompt-versioning pipeline with computed hashes, mtime-aware caching, and agent-level observability hooks.
- Fixed `original_query` being reset on every request instead of preserved across multi-turn sessions.
- Hid internal tool validation errors from user-facing output.
- Fixed typing issues in the thinking panel widget and constants module.
- Removed dead NeXTSTEP UI tool renderer modules for retired tools (`glob`, `grep`, `list_dir`) and cleaned renderer exports/tests to match the tinyagent-only tool surface.
- Tightened the session inspector presentation and removed a redundant compaction field to reduce TUI chrome noise.
- Tool panel CSS flow with status-based classes (running/completed/failed)
- Compaction awareness indicator in context panel
- CSS tint styling for file tool states (read/update)
- Parallel tool lifecycle coverage: capped-concurrency execution test, RequestOrchestrator parallel batch tests, and status-bar callback sequencing tests
- Enhanced `/compact` command with error handling and user feedback
- Updated read_file renderer to support new hashline format (1:ab|content)
- Improved tool panel rendering with semantic CSS classes
- Bumped `tiny-agent-os` dependency to `>=1.2.5` and refreshed lockfile resolution
- Enforced max 3 in-flight tool executions via shared tool semaphore wrapping
- Hardened RequestOrchestrator tool-start arg normalization and made batch-mode duration reporting explicit (suppressed for multi-tool batches)
- Updated status bar running-state behavior to stay coherent while multiple tools are active
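
The in-flight cap above can be approximated with a shared `asyncio.Semaphore`. This is a minimal sketch under that assumption; `run_capped` and `fake_tool` are hypothetical names, and the real wrapping in TunaCode may differ.

```python
import asyncio

MAX_IN_FLIGHT = 3  # cap taken from the changelog entry


async def run_capped(tool_calls, semaphore: asyncio.Semaphore):
    """Run all tool calls, allowing only as many at once as the semaphore permits."""

    async def guarded(call):
        async with semaphore:
            return await call()

    return await asyncio.gather(*(guarded(call) for call in tool_calls))


async def demo() -> int:
    """Launch 8 fake tools and report the highest concurrency observed."""
    semaphore = asyncio.Semaphore(MAX_IN_FLIGHT)
    active = 0
    peak = 0

    async def fake_tool() -> None:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # simulate tool work
        active -= 1

    await run_capped([fake_tool] * 8, semaphore)
    return peak


peak = asyncio.run(demo())
assert peak <= MAX_IN_FLIGHT
```

Sharing one semaphore across all callers, rather than creating one per batch, is what makes the cap global.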
- Rewrite system prompt to describe tools by purpose, not signature
- Clean up discover.py (remove section comments and inline comments)
- Normalize @mentions to absolute paths in UI
- Extracted thinking state and lifecycle management from app.py into dedicated classes
- System prompt tooling refresh
- Temporarily disabled live text streaming; default thoughts ON
- Removed rich_log alias and enforced full cutover
- Trimmed app.py docstrings to stay under 600-line hook limit
- Eliminated blank space between chat content and streaming output
- Dead code: unused openai_response_validation.py
- Added orphan module detection to CI and removed dead code from agent_helpers
- `/update` command now works for global installs (`pipx`, `uv tool`). Version check uses PyPI JSON API instead of shelling out to `pip`. Upgrade detects tool-managed venvs and runs the correct upgrade command.
- `discover` tool: unified code discovery replacing manual glob/grep/read chains with a single natural-language query
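
The PyPI-based version check in the `/update` entry above could look roughly like this. The PyPI JSON endpoint (`https://pypi.org/pypi/<project>/json`) and its `info.version` field are real; the helper names are hypothetical, and the dotted-integer comparison is a simplification that ignores pre-release and other PEP 440 forms.

```python
import json
from urllib.request import urlopen

PYPI_JSON_URL = "https://pypi.org/pypi/{package}/json"


def parse_latest_version(payload: dict) -> str:
    """Extract the latest released version from a PyPI JSON API payload."""
    return payload["info"]["version"]


def version_tuple(version: str) -> tuple[int, ...]:
    """Simplified parser: handles plain dotted integers such as '0.1.94' only."""
    return tuple(int(part) for part in version.split("."))


def is_update_available(installed: str, latest: str) -> bool:
    return version_tuple(latest) > version_tuple(installed)


def fetch_latest_version(package: str, timeout: float = 10.0) -> str:
    """Query PyPI over HTTPS; requires network access."""
    with urlopen(PYPI_JSON_URL.format(package=package), timeout=timeout) as response:
        return parse_latest_version(json.load(response))
```

Keeping the network call separate from the parsing and comparison helpers lets the latter be unit-tested offline.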
- Replaced the synthetic `discover` benchmark with a real-repo harness in `tests/benchmarks/bench_discover.py` comparing single-call `discover` against the legacy `list_dir -> glob -> grep -> read_file` chain
- Benchmark output now reports end-to-end metrics that matter for agent workflows: cold/warm latency (including p50/p95), tool-call count, output token footprint, file/symbol recall, and actionability
- Current baseline run on TunaCode shows `discover` at 1.0 average tool calls vs 6.0 for legacy (6x fewer round trips), ~4.2x fewer output tokens (~3.1k vs ~12.9k), ~9% lower latency (cold: 2587ms vs 2811ms, warm: 2597ms vs 2832ms), and higher retrieval quality/actionability
- `glob` tool (replaced by `discover`)
- `grep` tool and `grep_components/` module (replaced by `discover`)
- `list_dir` tool (replaced by `discover` + `bash`)
- API key UX: environment variable fallback, inline entry screen, and better error surfacing
- Inline API key entry screen for model-selection flow when provider credentials are missing
- Unit coverage for API key entry behavior and authentication error rendering
- Model command now reuses a named provider/model delimiter constant for provider parsing
- MiniMax provider endpoints normalized to `/v1` contract paths in integration/unit coverage
- API key save flow now raises user-facing errors for filesystem write failures without swallowing unexpected exceptions
- Headless-mode background task handling now shields request task execution and avoids timeout-driven cancellation races
- First-class MiniMax alchemy routing for coding-plan and default execution paths
- MiniMax provider contract coverage in integration and unit tests
- Updated provider defaults and models registry to expose MiniMax API-key contract entries
- Pinned `tiny-agent-os` dependency to `>=1.2.1` to support MiniMax provider routing
- Thinking content streaming for extended thinking models (Claude 3.5+ with thinking enabled)
- `/thoughts` command to toggle thinking panel visibility
- Thinking panel renderer with truncation and collapsible display
- Tests for thinking stream routing, response extraction, and panel rendering
- Updated bundled models registry from models.dev (92 providers)
- Slopgotchi pet widget in context inspector with click interaction and ASCII art cycling
- Context panel with model/token/cost summary and edit tracking
- Codebase structure tree documentation
- `/cancel` command for request cancellation
- `/exit` command and refresh UI command docs/tests
- Dead-code CI checks (`unimport`, `vulture`) in lint workflow
- Unit test for context panel summary
- Renamed tamagochi module to slopgotchi
- Side-by-side diff view uses explicit Before/After captions and visible change lane
- Context rail border title set to "Session Inspector" with theme-colored accent
- Reduced core agents main module to 600 lines via helpers extraction
- Updated README interface section and clarified tool execution
- Dead tools parsing module (`src/tunacode/tools/parsing/`)
- Pre-commit file-length hook performance
- Chat container tuple write issue
- Allow manual compaction below threshold
- Keep streaming updates and remove tint crash path
- Updated theme architecture handoff and CSS architecture documentation
- Live integration test for tinyagent alchemy usage contract (`tests/integration/core/test_tinyagent_alchemy_usage_contract_live.py`)
- Dedicated debug usage/resource lifecycle tracing module (`src/tunacode/core/debug/usage_trace.py`)
- Unit coverage for strict session usage schema loading and resource bar session cost propagation
- Research map documenting TunaCode Rust-only tinyagent migration path (`.claude/metadata/research/2026-02-12-tinyagent-rust-only-migration-map.md`)
- Migrated TunaCode agent runtime stream path to tinyagent Rust alchemy stream (`stream_alchemy_openai_completions`)
- Migrated compaction summary generation stream path to tinyagent Rust alchemy stream
- Updated usage model to strict canonical tinyagent contract (`input/output/cache_read/cache_write/total_tokens/cost`)
- Bumped `tiny-agent-os` to `1.1.5`
- Enforced fail-loud behavior when assistant usage payload is missing or violates canonical contract
- Resource bar session cost now reads canonical `session_total_usage.cost.total`
- Core Agent: Migrated from pydantic-ai to tinyagent as the core agent loop
- Text Selection: Added Rich-renderable mouse selection support in chat panels
- Visual Styling: SelectableRichVisual for text selection in Rich renderables
- CSS-Based Theming: Textual CSS styling system with 5 stylesheet files (panels, theme-nextstep, layout, widgets, modals)
- NeXTSTEP Theme: 3D bevel borders with light top/left and dark bottom/right for raised effect
- Context Management: Context compaction system with overflow retry capabilities
- Token Tracking: OpenRouter token usage tracking for streaming responses
- Models Registry: Expanded models_registry.json with full provider catalog and API URL routing
- Provider Routing: Native TinyAgent `OpenRouterModel(base_url)` for provider API URL routing
- BREAKING: Agent session persistence uses dict messages only - existing sessions may not load correctly
- BREAKING: Tool execution is now sequential (was parallel via custom orchestrator)
- Replaced Rich Panel wrappers with CSS-based PanelMeta pattern
- Agent creation now constructs tinyagent.Agent with AgentOptions
- Revamped theme architecture with CSS variable-based theming system
- Updated layout, panels, and widget styles for improved visual consistency
- Bumped tiny-agent-os to v1.1.3 (includes Rust bindings)
- Orchestrator and tool dispatcher components (tinyagent handles tool execution)
- Streaming components (pydantic-ai specific)
- Tool executor (tinyagent owns tool execution)
- pydantic-ai dependency (replaced with tiny-agent-os)
- Restored status bar content row with top bevel
- End-to-end tests for mtime-driven caches
- Typed cache accessors for agents, context, and ignore manager
- Strict cache infrastructure with strategies
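
To make the idea of an mtime-driven cache concrete, here is a minimal sketch. `MtimeCache` is a hypothetical class written for this example and is not the project's `CacheManager`; it simply re-reads a file whenever its modification time changes.

```python
import os
import tempfile
from pathlib import Path


class MtimeCache:
    """Hypothetical cache that reloads a file when its mtime changes."""

    def __init__(self) -> None:
        self._entries: dict[Path, tuple[int, str]] = {}
        self.loads = 0  # number of real disk reads, for demonstration

    def read(self, path: Path) -> str:
        mtime = path.stat().st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # file unchanged: serve the cached text
        text = path.read_text(encoding="utf-8")
        self.loads += 1
        self._entries[path] = (mtime, text)
        return text


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "prompt.md"
    target.write_text("v1", encoding="utf-8")
    cache = MtimeCache()
    first = cache.read(target)
    second = cache.read(target)  # served from cache, no disk read
    target.write_text("v2", encoding="utf-8")
    # Bump mtime explicitly so the change is visible on coarse-grained filesystems.
    stat = target.stat()
    os.utime(target, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    third = cache.read(target)  # mtime changed, so the file is re-read
```

End-to-end tests for such a cache typically follow the same pattern: read, read again, modify the file, and confirm that only the modification triggers a reload.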
- Migrated remaining lru_cache caches into CacheManager layer
- Refactored agent_config to use typed cache accessors
- Cache accessor now used for ignore manager; removed global ignore cache
- Reduced McCabe complexity to ≤10 for 14 functions
- Reduced cognitive complexity of 13 functions to under 10
- Re-enabled Ruff mccabe complexity check (max 10)
- Satisfied pre-commit dead code checks