Status: Active | Updated: 2026-02-15 | Owner:
High-level roadmap connecting vision to execution. Detailed implementation plans live in .agents/plans/.
plan.md ← What & why (this file)
AGENTS.md ← Rules & permissions
.agents/plans/ ← How (implementation details)
.agents/context/ ← Reference (architecture, domain knowledge)
.agents/research/ ← Research, evaluations, analyses
.agents/design/ ← Design references & UI research
.agents/workflows/ ← Process (dev cycle, debugging)
.agents/skills/ ← Guides (multi-step tasks)
Sources:
.agents/plans/todo-fixes-and-features.md,.agents/research/open-webui-analysis/SUMMARY.md,.agents/research/webclaw/06-recommendations.md
-
#1 Bug fixes (
In progress— 4/6 complete, 1 needs QA) Sources:plans/base-ui-pattern-fixes.md,plans/provider-neutral-replay-compiler.md- List unresolved regressions from Base UI migration and map each to owner/file.
- Fix text loading animation shift and verify no layout jump in streaming states. ✅
TextShimmerLoaderwith proper gap handling - Fix stop-streaming button behavior and ensure stream cancellation is immediate. ✅
ChatInputstop handler + AI SDK integration - Prevent streaming from continuing after exit/navigation. ✅
useEffectcleanup + 120s timeout guard inuse-chat-core.ts - Fix model persistence so selected model is restored reliably. ✅
useModelhook with DB/context persistence - Remove link text formatting artifacts (
()rendering issues).⚠️ LinkMarkdowncomponent exists, needs QA verification
-
#2 Tool calling infra (
Done)
Sources:plans/tool-calling-infrastructure.md,plans/tool-calling-hardening.md,research/multi-tool-calling-system-design.md,research/tool-calling-infrastructure.md- Finalize 3-layer hybrid architecture (provider, third-party, MCP).
- Implement and validate dual-gate injection logic.
- Document hardening outcomes and known limitations.
-
#3 Prompt-kit integration (
Not started— audit complete, implementation blocked) Sources:plans/prompt-kit-component-audit.md- Audit Prompt-kit components and map them to current chat UI surfaces. ✅ Comprehensive 256-line audit complete
- Implement highest-priority component swaps with visual parity checks. (Priority:
ThinkingBar,SystemMessage) - Validate keyboard/ARIA behavior and interaction regressions.
- Update docs/examples for new component usage.
-
#4 Visible failure feedback (
In progress— tool status done, RAG missing) Sources:research/open-webui-analysis/SUMMARY.md- Define a shared status model for tool calls, RAG, and truncation events.
⚠️ Tool status exists (ToolPartwith 4 states), RAG model missing - Add inline/badge indicators for active, partial, failed, and recovered states. ✅ Tool badges complete, ❌ RAG indicators missing
- Add user-facing error copy with actionable next steps.
⚠️ Length limit has copy, tool errors show raw text, RAG has none - Verify behavior across streaming + retry flows. ❌ Streaming error transitions not implemented
- Define a shared status model for tool calls, RAG, and truncation events.
-
#5 Security headers (
Not started— templates exist in docs) Sources:research/open-webui-analysis/SUMMARY.md,.agents/context/deployment.md,.agents/skills/base-ui-csp-provider/- Add baseline CSP with least-privilege defaults and required exceptions. (Reference:
.agents/context/deployment.md) - Add HSTS (with rollout-safe max-age strategy).
- Add
X-Frame-Optionsand validate embed behavior expectations. - Validate headers in local, preview, and production deployments.
- Add baseline CSP with least-privilege defaults and required exceptions. (Reference:
-
#6 Multi-chat card in single conversations (
Identified— root cause confirmed) Sources:app/components/chat/chat-container.tsx,app/components/multi-chat/multi-conversation.tsx- Reproduce by opening prior single-model thread while multi-chat mode is selected. ✅ Bug confirmed present
- Identify state source causing incorrect card-view rendering. ✅
multiModelEnabledpreference flag +groupResponsesstate mismatch - Fix mode/thread reconciliation logic. (Fix location:
chat-container.tsx:11-15,multi-conversation.tsx:131-149) - Add regression test for single-thread render fallback.
-
#7 Missing share button on multi-chat threads (
Identified— gated behind mode check) Sources:app/components/layout/header.tsx,app/components/layout/dialog-publish.tsx- Trace share-button visibility conditions for single vs multi-chat routes. ✅ Gated:
{!isMultiModelEnabled && <DialogPublish />}atheader.tsx:59 - Implement share action + UI state for multi-chat threads. (Underlying
DialogPublishis functional, just needs gate removed) - Verify permissions, URL generation, and copied-link behavior. (
/share/${chatId}format works for both modes) - Add regression test for multi-chat share affordance.
- Trace share-button visibility conditions for single vs multi-chat routes. ✅ Gated:
-
#8 Edit/resend message bug (
Identified— messages disappear on edit) Sources: —- Investigate: editing and resending messages causes an error; sent messages disappear after resend.
- Identify root cause in message edit/resend flow (
use-chat-core.ts,use-chat-edit.ts). - Fix and verify messages persist correctly after edit/resend across providers.
- Add regression test for edit/resend message lifecycle.
-
#9 OS-conditional shortcut labels in tooltips (
Not started— needs verification) Sources: Sidebar shortcut update discussion (2026-02-18)- Add platform detection utility (
MacvsWindows/Linux) for shortcut rendering. (Checklib/utilsor create new) - Replace hardcoded shortcut labels in sidebar toggle tooltip.
- Roll out shared shortcut label component for other tooltips.
- Add unit test coverage for platform-specific output.
- Add platform detection utility (
-
Fix create project chats (
Not started)
Sources: —- Investigate and fix project chat creation flow.
-
#10 UX redesign: tools/thinking (
Not started)
Sources:plans/prompt-kit-component-audit.md,research/competitive-feature-analysis.md- Document UX target states (loading, running tool, tool complete, thinking).
- Implement revised layout and hierarchy to match competitive patterns.
- Validate mobile + desktop readability and collapse behavior.
-
#9 UX redesign: sources (
Not started)
Sources:research/competitive-feature-analysis.md- Define source card schema (title, host, snippet, confidence/relevance).
- Redesign source presentation with better scanability and click targets.
- Add truncation/expand interactions and keyboard support.
-
#10 Settings page (
Not started)
Sources:research/competitive-feature-analysis.md- Define route architecture for full-page settings.
- Migrate existing modal sections into page tabs/sections.
- Preserve deep links and back-navigation behavior.
- Remove or deprecate legacy modal entry points.
-
#11 Model selector simplification (
Not started)
Sources:research/competitive-feature-analysis.md- Reduce selector cognitive load (grouping, defaults, fewer nested popovers).
- Surface thinking effort controls directly in selector flow.
- Validate persistence and cross-session recall of user choices.
-
#12 Inline message controls (
Not started)
Sources:research/competitive-feature-analysis.md- Add per-message model override control.
- Add per-message web-search toggle.
- Add attach/remove file controls scoped to message composition.
- Confirm compatibility with retry/edit flows.
-
#13 Message rating/feedback (
In progress)
Sources:research/competitive-feature-analysis.md,research/open-webui-analysis/SUMMARY.md- Finalize thumbs up/down interaction and active states.
- Persist feedback to backend with message/thread linkage.
- Add analytics/reporting hooks for feedback trends.
-
#14 Conversation export (
Not started)
Sources:research/competitive-feature-analysis.md,research/open-webui-analysis/SUMMARY.md- Define export schema for Markdown and JSON formats.
- Implement export action in thread UI.
- Validate exports with long threads and tool results.
-
#15 Template variables in prompts (
Not started)
Sources:research/open-webui-analysis/SUMMARY.md- Define supported variable list and escaping behavior.
- Implement template substitution pipeline before model dispatch.
- Add preview/validation UI for unresolved variables.
-
#16 Task model separation (
Not started)
Sources:research/open-webui-analysis/SUMMARY.md- Route title/tag generation to lower-cost model.
- Add fallback and timeout behavior.
- Track cost/latency impact against baseline.
-
#17 Structured audit logging (
In progress)
Sources:research/open-webui-analysis/SUMMARY.md- Define log event taxonomy and severity levels.
- Implement Convex wrappers with consistent metadata shape.
- Add configurable log level controls per environment.
-
#18 RAG pipeline (
Not started)
Sources:research/open-webui-analysis/SUMMARY.md,context/database.md- Implement chunking strategy and metadata schema.
- Add embedding generation pipeline via selected API provider.
- Integrate Convex vector search retrieval into chat flow.
- Evaluate retrieval quality with representative datasets.
-
#43 Motion preference respect (
Not started)
Sources: ChatGPT prompt input research (design/chatgpt-reference/prompt-input/), MDN prefers-reduced-motion- Audit all Motion/Framer animations and add
useReducedMotion()guards. (Files:app/components/chat-input/file-list.tsx,app/components/chat-input/suggestions.tsx,components/motion-primitives/) - Add
motion-safe:Tailwind prefix to CSS transitions. (Files:components/ui/prompt-input.tsx,app/components/chat-input/chat-input.tsx) - Verify reduced-motion behavior across composer, suggestions, file list, and streaming animations.
- Test with macOS "Reduce motion" accessibility setting and
prefers-reduced-motion: reduceemulation in DevTools.
- Audit all Motion/Framer animations and add
-
#47 Payclaw architecture hardening (beyond observability) (
Not started)
Sources:app/api/chat/route.ts,lib/tools/platform.ts,app/api/payclaw/status/route.ts,convex/schema.ts,convex/users.ts,convex/toolCallLog.ts,lib/encryption.ts,lib/user-keys.ts,lib/payclaw/client.ts- Job ownership/authz boundary: Add a Payclaw job ownership ledger (
payclawJobstable) and enforce authorization on allpay_statusreads so users cannot access jobs they did not create. - Card ID at-rest hardening: Remove plaintext
payClawCardIdstorage path; encrypt card IDs at rest using existinglib/encryption.tsprimitives with a migration-safe fallback. - Status path consolidation: Designate
pay_status(tool call) as the canonical status path; deprecate/api/payclaw/statusHTTP route with deprecation headers/logging and confirm no active runtime dependents. - Execution sequencing: Ownership ledger first → authz enforcement on reads → card ID storage hardening → status path consolidation. Each stage independently shippable.
- Job ownership/authz boundary: Add a Payclaw job ownership ledger (
-
#19 Image generation in chat (
Not started)
Sources:plans/phase-7-future-tool-integrations.md,research/open-webui-analysis/SUMMARY.md- Define provider abstraction for image models (DALL-E, Gemini) under BYOK.
- Build prompt-to-image request flow and response rendering UI.
- Add content policy checks and error handling.
-
#20 Inline triggers (
#,/,@) (Not started)
Sources:research/competitive-feature-analysis.md,research/open-webui-analysis/SUMMARY.md- Implement trigger parser and suggestion engine in composer.
- Add keyboard navigation and insertion UX.
- Integrate file/model/tool resolution for each trigger type.
-
#21 Cross-conversation memory (
Not started)
Sources:plans/cross-conversation-memory.md,research/open-webui-analysis/SUMMARY.md- Define memory schema (fact/entity/source/confidence/updatedAt).
- Implement model-callable memory tools (search/add/replace).
- Add user controls for memory visibility and deletion.
-
#22 Personalization (
Not started)
Sources:research/competitive-feature-analysis.md,research/open-webui-analysis/SUMMARY.md- Add custom instruction and rule management UI.
- Add theme controls (fonts/colors) with persisted preferences.
- Add model preset/persona creation and assignment flows.
-
#23 Code execution sandbox (
Not started)
Sources:plans/phase-7-future-tool-integrations.md,research/open-webui-analysis/SUMMARY.md- Evaluate E2B vs WebContainers for security, latency, and cost.
- Implement sandbox lifecycle (start/run/terminate) abstraction.
- Add output streaming + timeout/failure handling UX.
-
#24 Audio STT/TTS (
Not started)
Sources:plans/phase-7-future-tool-integrations.md,research/open-webui-analysis/SUMMARY.md- Define provider adapter interfaces for STT and TTS.
- Implement microphone capture and transcription flow.
- Implement response speech playback with provider selection.
-
#25 Extend replay model (
Done)
Sources:plans/provider-neutral-replay-compiler.md,plans/provider-aware-history-adaptation.md,research/provider-aware-history-adaptation.md- Preserve provider-native payloads per origin in replay model.
- Validate compatibility with provider-aware history adaptation.
- Evaluate optional expansion scope for additional payload classes.
-
#26 Inline
styleoverride safety (Not started)
Sources: Identified during dropdown-menu transition fix- Audit affected components:
select,dialog,alert-dialog,sheet,dropdown-menu. - Merge internal transition style with consumer
styleprops in all components. - Add regression tests ensuring animation survives style overrides.
- Audit affected components:
-
#27 Multi-chat layout redesign (
Not started)
Sources: —- Implement 2-chat 50/50 split behavior.
- Implement 3+ chat horizontal scrolling layout.
- Validate responsive breakpoints and minimum card widths.
-
#28 Evaluate unused SDK features (
Not started)
Sources:research/skill-gap-analysis-2026-02.md(Section 2: SDK Features NaW Doesn't Use Yet)- Evaluate middleware (
wrapLanguageModel) for logging, caching, guardrails, RAG augmentation. - Evaluate provider registry (
createProviderRegistry) as potential replacement for customopenproviders. - Evaluate telemetry (
experimental_telemetry) for OpenTelemetry-based observability. - Evaluate message metadata, data parts, and resume stream for production resilience.
- Prioritize adoption candidates and create implementation tickets.
- Evaluate middleware (
-
#29 ChatGPT-like notifications and reminders (
Not started)
Sources: —- Research product requirements and expected reminder/notification flows.
- Define reminder data model and scheduling mechanism.
- Prototype notification delivery UX and permission handling.
-
#44 Rich-text composer (ProseMirror/TipTap) (
Not started)
Sources: ChatGPT prompt input research (design/chatgpt-reference/prompt-input/), TipTap, ProseMirror, Lexical- Evaluate TipTap vs Lexical vs raw ProseMirror for bundle size, React 19 compat, and extension ecosystem.
- Prototype replacement of
PromptInputTextarea(components/ui/prompt-input.tsx:117-166) with rich-text editor. - Preserve existing auto-resize, keyboard handling (
chat-input.tsx:133-149), paste behavior (chat-input.tsx:151-188), andPromptInputcontext API (components/ui/prompt-input.tsx:35-55). - Migrate
ChatInputintegration (app/components/chat-input/chat-input.tsx) to use new editor surface. - Add extension points for
#,/,@inline triggers (connects to #20 Inline triggers). - Validate accessibility (keyboard navigation, screen reader, ARIA) and mobile virtual keyboard behavior.
- Benchmark bundle size delta and first-input latency vs current
<textarea>baseline.
-
#30 Model access control ACLs (
Not started)
Sources:research/open-webui-analysis/SUMMARY.md- Define ACL policy model (role, org, workspace, model scope).
- Enforce ACL checks in model selection and API dispatch.
- Add admin UI for policy management and audits.
-
#31 Admin-mutable config (PersistentConfig via Convex) (
Not started)
Sources:research/open-webui-analysis/SUMMARY.md- Define typed config schema with validation and defaults.
- Implement Convex-backed CRUD with auth/ownership controls.
- Add admin controls for live config updates.
-
#32 OpenTelemetry integration (
Not started)
Sources:research/open-webui-analysis/SUMMARY.md- Define telemetry coverage map (API, model calls, UI events).
- Integrate tracing/metrics exporters for target environments.
- Add sampling + privacy controls for sensitive payloads.
-
#33 Prompt library with
/trigger (Not started)
Sources:research/competitive-feature-analysis.md,research/open-webui-analysis/SUMMARY.md- Design prompt template storage schema and sharing model.
- Integrate
/trigger command palette for prompt insertion. - Add CRUD UI and usage analytics for prompt library entries.
-
#34 Flowglad integration (
Not started)
Sources:AGENTS.md- Define billing entities and subscription lifecycle mapping.
- Implement checkout, upgrade/downgrade, and cancellation flows.
- Add webhook handling for plan state synchronization.
-
Audio experience roadmap (music + SFX) (
Not started)
Sources:research/open-webui-analysis/SUMMARY.md,research/competitive-feature-analysis.md- Continue investigating performant approaches for in-app music and sound effects.
- Define event taxonomy (send, complete, error, tool-state) and mapping to audio cues.
- Define caching/loading strategy (lazy load, prewarm, cache headers, fallback formats).
- Define user controls (mute, per-channel volume, accessibility-safe defaults).
-
Character animation exploration with Rive + PixelLab (
Not started)
Sources:https://rive.app/,https://www.pixellab.ai/- Explore
rive.appfor interactive character animation workflows suitable for chat UX. - Investigate
pixellab.aifor AI-assisted character/sprite animation capabilities and fit for product direction. - Evaluate runtime performance, bundle impact, and authoring complexity.
- Identify one pilot surface (e.g., loading/thinking state) for a prototype.
- Explore
Sources:
.agents/research/webclaw/06-recommendations.md(Sections 1, 2, 4)
-
#35 Message memoization (
Done)
Benefit: Eliminate long-thread streaming jank by restricting re-renders to active message.
Sources:research/webclaw/06-recommendations.mdR01- Add
React.memoguard and content-based equality comparator. - Validate rendering complexity change (O(N) -> O(1) per chunk).
- Confirm no regression in edited/retried messages.
- Add
-
#36 Composer ref isolation (
Done)
Benefit: Stop per-keystroke cascading re-renders through chat tree.
Sources:research/webclaw/06-recommendations.mdR02- Move input state tracking to
useRef. - Add 500ms draft persistence debounce.
- Add
beforeunloadflush for draft reliability.
- Move input state tracking to
-
#37 Singleton Shiki highlighter (
Done)
Benefit: Prevent repeated WASM initialization and improve code block latency.
Sources:research/webclaw/06-recommendations.mdR04- Create module-level singleton
highlighterPromise. - Reuse singleton across all markdown/code renders.
- Validate no duplicate highlighter bootstrapping.
- Create module-level singleton
-
#38 Typography utilities (
Partial)
Benefit: Improve text readability with near-zero runtime cost.
Sources:research/webclaw/06-recommendations.mdR05- Apply
text-balance/text-prettyin.prosemarkdown context. - Extend utility classes to sidebar titles.
- Extend utility classes to standalone headings/components.
- Apply
-
#39 Pragmatic hook decomposition (
In progress)
Benefit: Improve maintainability/testability of complex chat flows.
Sources:research/webclaw/06-recommendations.mdR06 (adapted)- Extract
use-chat-operations.tsanduse-chat-draft.ts. - Extract
use-chat-submit.tsfromuse-chat-core.ts. - Extract
use-chat-edit.tsfromuse-chat-core.ts. - Add focused tests for extracted hooks.
- Extract
-
#40
typeoverinterface(Not started)
Benefit: Consistent type composition with unions/utilities.
Sources:research/webclaw/06-recommendations.mdR09- Add lint preference/rule for new files only.
- Document migration policy (no codemod, opportunistic adoption).
- Track adoption in new/changed modules.
-
#41 Context meter (
Not started)
Benefit: Show remaining context window before limit hits.
Sources:research/webclaw/06-recommendations.mdR08- Implement phase 1 estimate from visible message history.
- Implement phase 2 with accumulated
usage.promptTokens. - Normalize across multi-model conversations with differing limits.
-
#42 Global prompt auto-focus (
Not started)
Benefit: Remove click-to-focus friction in composer.
Sources:research/webclaw/06-recommendations.mdR07- Add global printable-key listener and focus guardrails.
- Exclude editable/interactive targets and modifier-key combos.
- Validate behavior across routes/modals/platforms.
-
Evaluate how to initiate new chats via text like OpenClaw (
Not started)- Research OpenClaw's text-triggered new chat initiation UX and interaction model.
- Evaluate feasible implementation patterns in NaW chat flow/composer.
- Propose recommended approach with trade-offs and rollout scope.
-
#45 Voice/dictation controls (
Not started)
Benefit: Enable speech-to-text input in the composer, matching ChatGPT's dictation UX.
Sources: ChatGPT prompt input research (design/chatgpt-reference/prompt-input/), Web Speech API, see also P2 #24 Audio STT/TTS- Add dictation button to composer trailing actions. (File:
app/components/chat-input/chat-input.tsx:244-280) - Implement Web Speech API
SpeechRecognitionwith start/stop/interim-result states. - Handle browser support detection and graceful fallback (hide button when unsupported).
- Stream recognized text into
PromptInputTextareavalue. (Integration:components/ui/prompt-input.tsx) - Add visual feedback (pulsing mic icon, waveform indicator) during active dictation.
- Coordinate with P2 #24 Audio STT/TTS for provider-based dictation (Whisper, Gemini) as a future upgrade path.
- Add dictation button to composer trailing actions. (File:
-
#46 Mobile-specific file inputs (
Not started)
Benefit: Expose camera capture and photo gallery picker on mobile, matching ChatGPT's mobile composer.
Sources: ChatGPT prompt input mobile research (design/chatgpt-reference/prompt-input/chatgpt-prompt-input-mobile-html.md)- Add hidden camera input (
accept="image/*" capture="environment"). (Ref: ChatGPT#upload-camera) - Add hidden gallery input (
accept="image/*" multiple). (Ref: ChatGPT#upload-photos) - Add "Take photo" and "Choose from gallery" items to
ButtonPlusMenu. (File:app/components/chat-input/button-plus-menu.tsx:127-181) - Gate camera/gallery items to mobile user-agents or touch-capable devices.
- Validate iOS Safari and Android Chrome capture behavior.
- Add hidden camera input (
Skipped from WebClaw research (premature for current team size/stage): full screen-based feature modules (R10), portal-based scroll container (R11 — not applicable, NaW uses
use-stick-to-bottomwith plain divs), pin-to-top scroll (R12, ship behind toggle if ever), unified message component (R13 — shared primitives already incomponents/ui/message.tsx, final unification deferred), cmdk replacement (R14), streaming batching (R15). Revisit when team scales or profiling justifies. Generation guard timer (R03) already implemented inuse-chat-core.ts.
Bug Fixes + Prompt-kit ← no deps (do first, P0)
Security Headers ← no deps (do first, P0)
Tool Calling Infra → Built-in Injection → Phase 7 → Code Execution (longest chain)
UX Redesign → Settings Page + Model Selector (P1 chain)
Audit Logging → OpenTelemetry
Inline Triggers ← no deps → Prompt Library (P2, deferred)
Memory System ← Convex vectors → RAG Pipeline (P2, shares infra)
Multi-chat Card Bug (#6) ← no deps (investigate P0)
Multi-chat Share Button (#7) ← no deps (investigate P0)
Edit/Resend Bug (#8) ← no deps (investigate P0)
Message Memo (#35) ✅ → Composer Ref (#36) ✅ → Hook Decomposition (#39, in progress) (P4 perf chain)
Shiki Singleton (#37) ✅ + Typography (#38, partial) + Auto-Focus (#42) ← no deps (P4 parallel)
Context Meter (#41) ← needs token accumulation from AI SDK usage object
Style Override Safety (#26) ← no deps (P2, affects select/dialog/alert-dialog/sheet/dropdown-menu)
Multi-chat Layout Redesign (#27) ← no deps (P2)
Evaluate Unused SDK Features (#28) ← no deps (P2, informs middleware/registry/telemetry adoption)
Motion Preference (#43) ← no deps (P1, accessibility)
Rich-text Composer (#44) → Inline Triggers (#20) (P2, enables rich input features)
Voice/Dictation (#45) → Audio STT/TTS (#24) (P4→P2, progressive enhancement)
Mobile File Inputs (#46) ← no deps (P4, mobile UX)
Details:
.agents/context/architecture.md| DB:.agents/context/database.md| API:.agents/context/api.md| Terms:.agents/context/glossary.md
All research:
.agents/research/
| Topic | Document |
|---|---|
| Provider history replay | .agents/research/provider-aware-history-adaptation.md |
| Open WebUI comparison | .agents/research/open-webui-analysis/SUMMARY.md |
| Latest models (Feb 2026) | .agents/research/latest-models-february-2026.md |
| Multi-tool calling | .agents/research/multi-tool-calling-system-design.md |
| Desktop/CLI access | .agents/research/desktop-cli-local-access-evaluation.md |
| Base UI migration | .agents/research/radix-to-base-ui-css-variables.md |
| WebClaw architecture & perf | .agents/research/webclaw/06-recommendations.md |