feat: CLI Feature Parity with Electron Desktop App #988

Open
aj47 wants to merge 34 commits into main from cli-feature-parity

aj47 commented Jan 21, 2026

Summary

This PR achieves full feature parity between the new CLI TUI client and the Electron desktop app. The CLI can now be used as a complete alternative to the desktop app for terminal users — including standalone operation with zero setup via the new embedded server mode.

Latest Changes (Cloudflare Tunnel QR Fix)

🐛 Fix: Tunnel Port Mismatch

  • Root cause: tunnel-service.ts hardcoded port 3210 from config, but the CLI embedded server runs on port 3211
  • Fix: Added setServerListeningInfo() / clearServerListeningInfo() in server.ts; getDefaultLocalUrl() now uses the actual listening port
  • Files: packages/server/src/services/tunnel-service.ts, packages/server/src/server.ts
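
A minimal sketch of how the fix could look. The three function names come from this PR; the bodies, the `ListeningInfo` shape, and the `CONFIG_DEFAULT_PORT` constant are assumptions for illustration, not the actual code in server.ts.

```typescript
// Hypothetical shape of the recorded listening info (an assumption).
interface ListeningInfo {
  host: string;
  port: number;
}

// Port the tunnel service previously assumed (hypothetical constant name).
const CONFIG_DEFAULT_PORT = 3210;

let listeningInfo: ListeningInfo | null = null;

// Record the address once the HTTP server has actually bound to a port.
function setServerListeningInfo(info: ListeningInfo): void {
  listeningInfo = info;
}

// Forget the address when the server stops, so a stale port is never reused.
function clearServerListeningInfo(): void {
  listeningInfo = null;
}

// Prefer the real listening port; fall back to the configured default.
function getDefaultLocalUrl(): string {
  const host = listeningInfo?.host ?? "localhost";
  const port = listeningInfo?.port ?? CONFIG_DEFAULT_PORT;
  return `http://${host}:${port}`;
}
```

With this shape, the tunnel would target port 3211 whenever the embedded server has registered itself, and 3210 otherwise.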

⏱️ Improvement: Tunnel URL Wait Timeout

  • Increased timeout from 12s → 30s (cloudflared quick tunnels can take 15-20s)
  • Reduced polling interval from 2s → 1s for faster detection
  • Added status messages: "Waiting for tunnel URL..." and "Tunnel failed to start"
  • File: apps/cli/src/app.ts
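
The wait loop above can be sketched roughly as follows. `waitForTunnelUrl` and the `getTunnelUrl` callback are hypothetical names; only the 30s timeout, the 1s interval, and the two status messages are taken from this PR.

```typescript
// Sketch only: `getTunnelUrl` is a hypothetical callback that resolves to the
// tunnel URL once cloudflared has reported it, or null while still starting.
async function waitForTunnelUrl(
  getTunnelUrl: () => Promise<string | null>,
  timeoutMs = 30_000, // cloudflared quick tunnels can take 15-20s
  intervalMs = 1_000, // poll every second for faster detection
  onStatus: (message: string) => void = () => {},
): Promise<string | null> {
  const deadline = Date.now() + timeoutMs;
  onStatus("Waiting for tunnel URL...");
  while (Date.now() < deadline) {
    const url = await getTunnelUrl();
    if (url) return url;
    // Sleep for one polling interval before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  onStatus("Tunnel failed to start");
  return null;
}
```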

📦 Feature: Cloudflared Install Option

  • When cloudflared is not installed, CLI now prompts the user to install it
  • macOS: Offers brew install cloudflared
  • Linux: Provides binary download instructions
  • Re-checks installation after install attempt and proceeds with tunnel start if successful
  • File: apps/cli/src/app.ts
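
As a rough illustration of the per-platform branching, assuming the platform string follows Node's `process.platform` values. The `InstallHint` type, the function name, and the instruction wording are hypothetical; only `brew install cloudflared` for macOS and "binary download instructions" for Linux come from this PR.

```typescript
// Hypothetical result type: either a command the CLI can offer to run,
// or manual instructions to display.
type InstallHint =
  | { kind: "command"; command: string }
  | { kind: "manual"; instructions: string };

function cloudflaredInstallHint(platform: string): InstallHint {
  switch (platform) {
    case "darwin": // macOS
      return { kind: "command", command: "brew install cloudflared" };
    case "linux":
      return {
        kind: "manual",
        // Placeholder wording, not the text the CLI actually prints.
        instructions: "Download the cloudflared binary for your architecture and add it to your PATH.",
      };
    default:
      return { kind: "manual", instructions: "Install cloudflared manually, then retry." };
  }
}
```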

What's Included (Full PR)

Server Enhancements (packages/server/)

  • OpenAI-compatible SSE streaming - Server now emits chat.completion.chunk format alongside existing progress events
  • Delta tracking for efficient content streaming
  • Backward compatible with mobile app
  • Library exports - @speakmcp/server/server and @speakmcp/server/config for programmatic use
  • Dual tsup build config: CLI entry (with shebang) + library exports (without)

CLI Client (apps/cli/)

Embedded Server Mode (NEW)

  • One command to run everything: bun run src/index.ts — no separate server needed
  • Three-phase connection logic:
    1. Explicit URL (--url) — connect to specified server, fail hard if unreachable
    2. Auto-discover — probe default ports (3210, 3211, 3212, 8080)
    3. Embedded server — start @speakmcp/server in-process as fallback
  • Auto-generates random API key for embedded mode
  • Default embedded port: 3211 (configurable via --port)
  • --no-server flag to disable embedded mode
  • --debug flag for verbose logging
  • Graceful shutdown on SIGINT/SIGTERM
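
The three phases could be sketched like this. `resolveConnection`, `ConnectOptions`, `Deps`, and the injected helpers are hypothetical names for illustration, and probing on `localhost` is an assumption. The port list, the default embedded port, and the fail-hard behavior for `--url` are taken from the list above.

```typescript
type Connection =
  | { mode: "explicit" | "discovered"; url: string }
  | { mode: "embedded"; url: string; apiKey: string };

interface ConnectOptions {
  url?: string; // --url
  port?: number; // --port (embedded mode only)
  noServer?: boolean; // --no-server
}

// Injected so the sketch stays independent of the real client and server.
interface Deps {
  isHealthy: (url: string) => Promise<boolean>;
  startEmbedded: (port: number, apiKey: string) => Promise<void>;
  generateApiKey: () => string;
}

const DISCOVERY_PORTS = [3210, 3211, 3212, 8080];
const DEFAULT_EMBEDDED_PORT = 3211;

async function resolveConnection(opts: ConnectOptions, deps: Deps): Promise<Connection> {
  // Phase 1: explicit URL, fail hard if unreachable.
  if (opts.url) {
    if (!(await deps.isHealthy(opts.url))) {
      throw new Error(`Server unreachable at ${opts.url}`);
    }
    return { mode: "explicit", url: opts.url };
  }
  // Phase 2: auto-discover by probing the default ports in order.
  for (const port of DISCOVERY_PORTS) {
    const url = `http://localhost:${port}`;
    if (await deps.isHealthy(url)) return { mode: "discovered", url };
  }
  // Phase 3: start the embedded server, unless disabled.
  if (opts.noServer) {
    throw new Error("No server found and embedded mode is disabled (--no-server)");
  }
  const port = opts.port ?? DEFAULT_EMBEDDED_PORT;
  const apiKey = deps.generateApiKey();
  await deps.startEmbedded(port, apiKey);
  return { mode: "embedded", url: `http://localhost:${port}`, apiKey };
}
```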

Critical Fixes

  • Fixed switchProfile endpoint URL (/v1/profiles/current instead of /v1/profiles/switch)
  • Added 5 missing API methods: toggleMcpServer, getModelsForProvider, exportProfile, importProfile, updateConversation
  • Fixed tunnel-service port mismatch for Cloudflare QR code flow

Agent Progress UI

  • Iteration counter, tool call visualization with status icons, tool results display, thinking indicators

Interactive Settings View

  • Provider/model dropdowns, max iterations input, MCP server toggles, save/reset buttons
  • Remote Server settings with Cloudflare tunnel QR code
  • Cloudflared install prompt when not installed

Full Keyboard Shortcuts

  • Global: Ctrl+C, Ctrl+P, Ctrl+N, ?, Esc
  • Chat: Arrow keys, PgUp/PgDn
  • Sessions: Enter, N, D
  • Settings: Tab, Enter
  • Help overlay

Error Handling & Reconnection

  • SSE error handling, retry with exponential backoff, connection status indicator, periodic health checks

Testing

  • All 235 server package tests pass
  • CLI TypeScript typecheck passes (0 errors)
  • All 14 API endpoints verified working
  • Embedded server mode tested end-to-end
  • Cloudflare tunnel QR flow verified end-to-end (correct port, QR renders, overlay works)

Architecture

Embedded Mode (default)

CLI (OpenTUI) --in-process--> Server (Fastify) --> LLM + MCP Tools

External Mode (--url)

CLI (OpenTUI) --HTTP/SSE--> Remote Server (Fastify) --> LLM + MCP Tools

How to Test

# Option 1: Embedded mode (recommended, one command)
cd apps/cli && bun run src/index.ts

# Option 2: With debug logging
cd apps/cli && bun run src/index.ts --debug

# Option 3: External server mode (two terminals)
# Terminal 1: start the server
cd packages/server && pnpm dev
# Terminal 2: connect the CLI to it
cd apps/cli && bun run src/index.ts --url http://localhost:3210 --api-key <key>

aj47 added 9 commits January 20, 2026 11:21
- Implement standalone HTTP server extracting agent capabilities from Electron app
- Add config layer (paths, defaults, env variables)
- Add core services (llm, mcp-service, conversation, profile, builtin-tools)
- Add Fastify HTTP server with OpenAI-compatible API
- Add CLI entry point with comprehensive options
- Add 228 tests with minimal mocking
- Bundle @speakmcp/shared for standalone npm install support
- Add agents.md knowledge base documentation
CLI App (apps/cli/):
- OpenTUI-based terminal interface for SpeakMCP
- Chat, sessions, settings, and tools views
- HTTP client connecting to @speakmcp/server
- Configuration via CLI flags, env vars, or auto-discover
- Requires Bun runtime for OpenTUI tree-sitter support

Bug Fix (packages/server/):
- Fix builtin tools not loading (toolCount: 0)
- Import actual builtinTools from builtin-tools.ts instead of empty stub
- Remove duplicate builtinTools addition in server.ts
- mcpService.getAvailableTools() already includes prefixed builtin tools

Documentation:
- Add apps/cli/agents.md with OpenTUI patterns and gotchas
- Update packages/server/agents.md with tool format and debugging tips
- Add streamingContent field to AgentProgressUpdate type
- Add emitOpenAIChunk helper in server.ts to emit OpenAI chunk format
- Track previous content length to emit only deltas (not full text)
- Emit both custom progress events (for mobile) and OpenAI chunks (for CLI)
- Update emit() helper in llm.ts to pass streamingContent
- Add streamingContent to key progress emissions where content is available

The server now emits dual formats during SSE streaming:
1. Custom: { type: "progress", data: AgentProgressUpdate } for mobile app
2. OpenAI: { id, object: "chat.completion.chunk", choices: [{ delta: { content } }] } for CLI

This maintains backward compatibility while enabling CLI streaming support.
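
A minimal sketch of the delta tracking described above, assuming the agent reports cumulative text in `streamingContent`. `createDeltaEmitter` is a hypothetical name (the PR's helper is `emitOpenAIChunk`, whose real signature is not shown here), and the chunk type lists only the fields mentioned in this commit message.

```typescript
// Only the fields named in the commit message; a real chunk may carry more.
interface ChatCompletionChunk {
  id: string;
  object: "chat.completion.chunk";
  choices: [{ delta: { content: string } }];
}

// Returns a function that converts cumulative content into delta-only chunks.
function createDeltaEmitter(
  id: string,
  send: (chunk: ChatCompletionChunk) => void,
): (streamingContent: string) => void {
  let sentLength = 0; // length of the content already sent to the client
  return (streamingContent: string): void => {
    if (streamingContent.length <= sentLength) return; // nothing new to send
    const delta = streamingContent.slice(sentLength);
    sentLength = streamingContent.length;
    send({ id, object: "chat.completion.chunk", choices: [{ delta: { content: delta } }] });
  };
}
```

Calling the returned function with "Hel" and then "Hello" sends two chunks whose deltas are "Hel" and "lo", so the client can append them without receiving the full text twice.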

Agent-Id: agent-6395c5a9-dd80-4dac-8ab7-acc0fd76970f
- Fix switchProfile to use correct endpoint /v1/profiles/current
- Add missing API methods: toggleMcpServer, getModelsForProvider,
  deleteConversation, exportProfile, importProfile
- Add progress types: AgentProgressStep, AgentProgressUpdate
- Add SSE event types: SSEEvent union with progress/done/error/chunk
- Update chatStream() to parse and yield typed SSE events
- Update chat view to handle new SSEEvent types
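
For illustration, a typed parser for a single SSE `data:` payload might look like the sketch below. The four variant names match the union listed above, but the field layouts, the `parseSSEData` name, and the `[DONE]` sentinel (an OpenAI streaming convention) are assumptions rather than the client's actual types.

```typescript
// Hypothetical field layouts for the progress/done/error/chunk union.
type SSEEvent =
  | { type: "progress"; data: unknown }
  | { type: "chunk"; content: string }
  | { type: "done" }
  | { type: "error"; message: string };

// Parses the text after "data: " in one SSE message; returns null to skip it.
function parseSSEData(payload: string): SSEEvent | null {
  if (payload === "[DONE]") return { type: "done" }; // assumed sentinel
  let parsed: any; // untyped JSON, narrowed by the checks below
  try {
    parsed = JSON.parse(payload);
  } catch {
    return { type: "error", message: "Malformed SSE payload" };
  }
  if (parsed?.object === "chat.completion.chunk") {
    const content = parsed.choices?.[0]?.delta?.content;
    return typeof content === "string" ? { type: "chunk", content } : null;
  }
  if (parsed?.type === "progress") return { type: "progress", data: parsed.data };
  if (parsed?.type === "done") return { type: "done" };
  if (parsed?.type === "error") {
    return { type: "error", message: String(parsed.message ?? "Unknown error") };
  }
  return null; // unknown event shape
}
```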

Agent-Id: agent-6d00a105-69ed-405f-b29e-5c498aa7d62e
- Display iteration counter (e.g., "⏳ Iteration 2/10")
- Show tool calls with name, arguments, and status icons:
  - ▶ running, ✓ complete, ❌ error
- Show tool results with success/error indicators
- Display thinking steps with 💭 icon
- Smooth transition from progress display to final message
- Handle both progress events and fallback streaming

Agent-Id: agent-f2115108-219b-4994-b463-db5c82391544
…s, MCP toggles, and save/reset

- Transform read-only settings view into interactive form
- Add SelectRenderable dropdowns for LLM provider (OpenAI, Groq, Gemini)
- Add dynamic model dropdown that fetches models from API on provider change
- Add InputRenderable for max iterations setting
- Add MCP server toggles with [Space] to enable/disable servers
- Add keyboard navigation with arrow keys for server selection
- Add [S] Save button that calls patchSettings() and toggleMcpServer()
- Add [R] Reset button to revert to original values
- Display status messages for user feedback

Agent-Id: agent-da028b8c-2a8e-48b4-a5e9-176e7da821be
…icator

- Add requestWithRetry() method with exponential backoff for resilient API calls
- Add isRetryable() to determine if errors should be retried (network, 5xx, 429)
- Add checkHealthWithState() for health checks with reconnection status callback
- Add ConnectionState type ('online' | 'reconnecting' | 'offline')
- Replace isConnected with connectionState in AppState
- Add connection status indicator in status bar (● Online, ○ Reconnecting..., ✗ Offline)
- Add periodic health check (30s interval) when idle
- Update isHealthy() to use retry logic for resilience
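
A sketch of the retry behavior, under stated assumptions: the names `requestWithRetry` and `isRetryable` come from this commit, but the signatures, the `HttpLikeError` shape, the retry count, and the base delay are hypothetical.

```typescript
// Hypothetical error shape: an HTTP status, or a flag for network failures.
interface HttpLikeError {
  status?: number;
  network?: boolean;
}

// Retry network errors, 5xx responses, and 429 (rate limited).
function isRetryable(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as HttpLikeError;
  if (e.network) return true;
  const status = e.status;
  return status !== undefined && (status === 429 || (status >= 500 && status <= 599));
}

async function requestWithRetry<T>(
  attempt: () => Promise<T>,
  maxRetries = 3, // assumed default
  baseDelayMs = 500, // assumed default
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i <= maxRetries; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      if (i === maxRetries || !isRetryable(err)) break;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      const delayMs = baseDelayMs * 2 ** i;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```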

Phase 6: Error Handling & Reconnection

Agent-Id: agent-b1e300c2-e8e9-41c1-bb05-4cca2ed0df78
Linked-Note-Id: b61f6756-cb74-49d7-942a-3650f03caaa8
- Global shortcuts: Ctrl+C (stop), Ctrl+P (profiles), Ctrl+N (new chat), ? (help), Esc (back)
- Chat view: arrow keys and PgUp/PgDn for scrolling
- Sessions view: Enter (resume), N (new), D (delete)
- Settings view: Tab navigation, Enter to toggle/save
- Help overlay showing all available shortcuts

Agent-Id: agent-92c3b2bd-5dfd-4b9e-98a3-dcc56a6bd56a
augmentcode bot commented Jan 21, 2026

This pull request is too large for Augment to review. The PR exceeds the maximum size limit of 100000 tokens (approximately 400000 characters) for automated code review. Please consider breaking this PR into smaller, more focused changes.

aj47 and others added 20 commits February 5, 2026 09:45
- CLI now starts @speakmcp/server in-process by default (no separate server needed)
- Three-phase connection: explicit URL → auto-discover → embedded server
- Auto-generates API key and starts on port 3211
- Added --port, --no-server, --debug CLI flags
- Graceful shutdown of embedded server on exit (SIGINT/SIGTERM)
- Updated @speakmcp/server with library exports (./server, ./config)
- Updated tsup config for dual entry points (CLI with shebang, library without)
- Updated README with CLI usage instructions
- Updated PRD to reflect current state and embedded server architecture
Tier 1 (CLI-only): G-02 agent progress, G-05 TTS toggle, G-06 tool
approval toggle, G-14 conversation search, G-11 manual tool execution

Tier 2: G-07 transcript post-processing toggle

Tier 3 (server endpoints): BUG delete conversation, G-01 tool approval
workflow, G-03 conversation rename, G-04 API key config + models service,
G-09 profile export/import, G-10 profile CRUD, G-16 diagnostics

Tier 4 (server + service): G-08 model preset selection, G-15 dual model
config, G-17 MCP server management

Tier 5 (major porting): G-12 memory management, G-13 skills management

Tier 6 (advanced): G-22 OAuth flow, G-23 elicitation/sampling, G-18
message queue, G-24 agent sessions, G-19 ACP agent delegation

All 228 tests pass. No new type errors in source files.
- Fix chat input firing on every keystroke (CHANGE → ENTER event)
- Fix sessions rename input using non-existent SUBMIT event (→ ENTER)
- Fix tools arg input using non-existent SUBMIT event (→ ENTER)
- Wire up tools view key handler in app.ts (was commented out)
- Fix settings key handling: accept KeyEvent instead of string,
  remove 's' save shortcut conflict, scope space toggle properly
- Replace BoxRenderable with ScrollBoxRenderable for chat messages
  (stickyScroll + scrollBy for actual scroll support)
- Add fallback F-key escape sequences for iTerm2 compatibility
- Add Alt+1-4 as universal view-switching alternatives

All 228 tests pass. CLI typecheck clean.
- Bug #1: Fix input focus lost after view switch (defer focus to next tick)
- Bug #2: Fix ? key double-handling in chat (blur input before help overlay)
- Bug #4: Fix Esc not forwarded to sessions/tools views (forward to all views)
- Bug #5: Fix sessions search filtering (soft-refresh without network reload)
- Bug #6: Fix arrow navigation jumping 2 positions (remove selectList.focus())
- Bug #7: Fix settings rendering artifacts (add marginTop spacing)
- Bug #8: Add S/R keyboard shortcuts to settings (matching action bar hints)
- Add key.sequence fallback for / key detection in sessions search
- Add profile create/edit/delete/export/import keybindings in profile switcher
  (C/E/D/X/I keys with manual navigation)
- Fix HTTP client Content-Type header issue causing DELETE requests to fail
  (Fastify returned 400 when Content-Type: application/json sent with no body)
- Widen profile switcher box from 50 to 62 chars for full footer visibility
- Use placeholder instead of value for edit mode input pre-fill
- Expand GET /v1/settings to return all config values merged with defaults,
  filtering desktop-only keys and masking sensitive keys
- Expand PATCH /v1/settings to accept all boolean toggles, string settings,
  number settings, API keys, and Langfuse keys with validation
- Add missing defaults to getDefaultConfig() (provider, model, STT, agent)
- Expand CLI Settings type with ~50 fields matching Electron settings
- Add Agent Settings, TTS Settings, and Langfuse Observability sections
  to CLI settings view with full keyboard navigation
- Update save/load/reset to handle all new settings fields
- Refactor toggle rendering to use generic helpers with TOGGLE_LABELS
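
The DELETE failure noted in the list above (Fastify returning 400 when `Content-Type: application/json` arrives with no body) can be avoided by setting the header only when a body exists. The sketch below shows one way to do that; `buildRequestInit` and its return shape are hypothetical and not the client's actual code.

```typescript
interface RequestInitLike {
  method: string;
  headers: Record<string, string>;
  body?: string;
}

// Attach Content-Type only when there is a JSON body to describe.
function buildRequestInit(method: string, body?: unknown): RequestInitLike {
  const headers: Record<string, string> = {};
  const init: RequestInitLike = { method, headers };
  if (body !== undefined) {
    headers["Content-Type"] = "application/json";
    init.body = JSON.stringify(body);
  }
  return init;
}
```

A bodiless `DELETE` built this way carries no `Content-Type` header, so the server has no empty JSON body to reject.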
Server was using ~/Library/Application Support/speakmcp/ while Electron
uses ~/Library/Application Support/app.speakmcp/, causing profiles,
config, and conversations to be stored separately.

Now getDataDir() uses app.speakmcp as the primary path with a fallback
to the legacy speakmcp directory for backward compatibility.
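
One possible shape for the fallback, as a sketch only: the real getDataDir() may decide differently, and `resolveDataDir` with its injected `exists` predicate is a hypothetical stand-in used here to keep the example free of filesystem access.

```typescript
// Chooses the data directory under a platform app-support folder.
// `exists` is injected (e.g. fs.existsSync in real code) for testability.
function resolveDataDir(
  appSupportDir: string,
  exists: (path: string) => boolean,
): string {
  const primary = `${appSupportDir}/app.speakmcp`; // shared with Electron
  const legacy = `${appSupportDir}/speakmcp`; // older server-only location
  if (exists(primary)) return primary;
  if (exists(legacy)) return legacy; // keep reading data written before the change
  return primary; // fresh installs use the shared path (assumed behavior)
}
```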
Profile switcher (Ctrl+P) and help overlay (?) now use position: 'absolute'
to truly overlay the screen instead of being stacked as flex children.
Inner box heights capped to terminal rows to prevent overflow.
Multi-byte UTF-8 characters (✓, ○, ►, ─, etc.) caused garbled rendering
in OpenTUI's native Rust renderer due to byte-length vs display-width
mismatch. Replaced all Unicode with plain ASCII equivalents across all
5 TUI source files (app.ts, chat.ts, settings.ts, tools.ts, sessions.ts).
Bootstrap console.log emoji in index.ts kept as-is (pre-TUI).
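
The mismatch is easy to reproduce: each of these symbols occupies one terminal column but three bytes in UTF-8. The sketch below measures that and shows a replacement pass; the `ASCII_REPLACEMENTS` table and `toAscii` are illustrative assumptions, not the substitutions actually made in the TUI source files.

```typescript
const encoder = new TextEncoder();

// Number of bytes the string occupies when encoded as UTF-8.
function utf8ByteLength(text: string): number {
  return encoder.encode(text).length;
}

// Hypothetical replacement table for the symbols named above.
const ASCII_REPLACEMENTS: Record<string, string> = {
  "✓": "+",
  "○": "o",
  "►": ">",
  "─": "-",
};

// Replace each mapped symbol with its ASCII stand-in; keep everything else.
function toAscii(text: string): string {
  return Array.from(text, (ch) => ASCII_REPLACEMENTS[ch] ?? ch).join("");
}

console.log("✓".length, utf8ByteLength("✓")); // prints "1 3"
console.log(toAscii("✓ done ─")); // prints "+ done -"
```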
- settings.ts: Use ScrollBoxRenderable for scrollable content, hide
  verticalScrollBar, add height:1 to all single-line TextRenderables,
  add scrollToFocusedField() for Tab navigation
- chat.ts: Hide scrollbar on messageContainer, add height:1 to role
  labels, tool text, streaming role, progress header/iteration/steps,
  and approval prompt/args/hint
- sessions.ts: Add height:1 to searchLabel, emptyText, renameLabel
- tools.ts: Add height:1 to noTools, serverTitle, noServerTools, toolText
Updated README with simplified CLI usage that highlights the embedded server mode as the recommended approach. Moved external server setup to advanced section.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
techfren pushed a commit to techfren/SpeakMCP that referenced this pull request Feb 12, 2026