feat(tts): introduce `TTS` system by Fodesu · Pull Request #195 · memohai/Memoh

Fodesu · 2026-03-06T03:14:38Z

Summary

Add a TTS system as an agent tool.

Features

Introduce the TTS subsystem with a pluggable adapter interface and an Edge TTS backend (via Microsoft's edge-readaloud WebSocket API).

- TtsAdapter interface with Capabilities/Synthesize/Stream
- Edge WebSocket client matching readest's protocol (one-shot connection, big-endian audio frames, Sec-MS-GEC token generation)
- AudioConfig with go-playground/validator struct tags
- ParamConstraint supporting both discrete options and continuous ranges
- Unit tests with mock WebSocket server, integration tests behind -tags=integration with audio file output

Made-with: Cursor
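For reviewers who want a concrete picture of the ParamConstraint item above, here is a minimal sketch of how a constraint covering both discrete options and continuous ranges could look next to the adapter interface. Only the names TtsAdapter, Capabilities, Synthesize, Stream and ParamConstraint come from this PR; every field name and method signature below is an assumption for illustration and may not match the actual code.

```go
package main

import (
	"context"
	"fmt"
	"io"
)

// ParamConstraint describes which values a tunable parameter accepts.
// Hypothetical shape: a discrete option list, or an inclusive range.
type ParamConstraint struct {
	Options []float64 // discrete allowed values; used when non-empty
	Min     float64   // inclusive lower bound, used when Options is empty
	Max     float64   // inclusive upper bound, used when Options is empty
}

// Allows reports whether v satisfies the constraint.
func (c ParamConstraint) Allows(v float64) bool {
	if len(c.Options) > 0 {
		for _, o := range c.Options {
			if o == v {
				return true
			}
		}
		return false
	}
	return v >= c.Min && v <= c.Max
}

// Capabilities is what an adapter advertises (field names are assumed).
type Capabilities struct {
	Formats []string
	Speed   ParamConstraint
	Pitch   ParamConstraint
}

// TtsAdapter lists the three methods named in the PR; signatures are guesses.
type TtsAdapter interface {
	Capabilities() Capabilities
	Synthesize(ctx context.Context, text string) ([]byte, error)
	Stream(ctx context.Context, text string) (io.ReadCloser, error)
}

func main() {
	speed := ParamConstraint{Options: []float64{0.5, 1.0, 2.0}} // discrete
	pitch := ParamConstraint{Min: -100, Max: 100}               // continuous

	fmt.Println(speed.Allows(1.0), speed.Allows(1.5)) // true false
	fmt.Println(pitch.Allows(50), pitch.Allows(150))  // true false
}
```

Keeping both forms in one struct lets a settings UI render either a dropdown or a slider from the same capability data, which seems to be the intent of the voice/format/speed/pitch config page.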

…ntegration

- Add tts_providers table, CRUD API, and Edge TTS adapter with WebSocket synthesis
- Add TTS provider management page with voice/format/speed/pitch config and test synthesis
- Add config schema support on provider meta for extensibility
- Add tts_provider_id to bot settings for per-bot TTS configuration
- Fix unsupported Edge TTS formats (ogg, audio-16khz) and improve WS error handling
- Add 500-char text limit on test synthesis (frontend + backend)
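The 500-char limit on test synthesis could be enforced on the backend roughly as follows. This is a sketch, not the code in this PR: the function and error names are hypothetical, and counting Unicode code points (rather than bytes) after trimming whitespace is an assumption the PR text does not confirm.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// maxTestTextChars is the limit stated in the PR description.
const maxTestTextChars = 500

var (
	ErrTextEmpty   = errors.New("text is empty")
	ErrTextTooLong = errors.New("text exceeds 500 characters")
)

// validateTestText checks a test-synthesis request body.
// Assumption: the limit counts runes, so CJK text is not penalised
// for its multi-byte UTF-8 encoding.
func validateTestText(text string) error {
	trimmed := strings.TrimSpace(text)
	if trimmed == "" {
		return ErrTextEmpty
	}
	if utf8.RuneCountInString(trimmed) > maxTestTextChars {
		return ErrTextTooLong
	}
	return nil
}

func main() {
	fmt.Println(validateTestText("hello"))                    // <nil>
	fmt.Println(validateTestText(strings.Repeat("语", 500)))   // <nil>
	fmt.Println(validateTestText(strings.Repeat("a", 501)))   // text exceeds 500 characters
}
```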

…eb playback

- Add text_to_speech agent tool that LLM can invoke when user requests voice
- Stream TTS audio to temp file (StreamToFile) to keep memory usage low
- Persist voice attachments via media.Service (content-addressed, per-bot container)
- Extract TTS voice from tool_call_end events in ChannelInboundProcessor
- Conditionally enable TTS action in resolver when bot has tts_provider_id
- Add inline audio player in web chat UI for voice/audio attachments
- Make VoiceConfig fields optional so adapters can use their own defaults
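To illustrate the StreamToFile and content-addressed storage items above, here is one way streaming to a temp file while hashing could be done with only the standard library. The function below is a hypothetical stand-in, not the PR's StreamToFile or media.Service; using SHA-256 as the content address is an assumption, and fakeStream simply replays the input text as if it were audio bytes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"os"
	"strings"
)

// streamToFile copies src into a new temp file without buffering the whole
// stream in memory, and returns the file path plus the SHA-256 hex digest
// of the bytes written (usable as a content address).
func streamToFile(src io.Reader, pattern string) (string, string, error) {
	f, err := os.CreateTemp("", pattern)
	if err != nil {
		return "", "", err
	}
	h := sha256.New()
	// MultiWriter feeds each chunk to both the file and the hash.
	_, copyErr := io.Copy(io.MultiWriter(f, h), src)
	closeErr := f.Close()
	if copyErr != nil || closeErr != nil {
		os.Remove(f.Name()) // best-effort cleanup of the partial file
		return "", "", errors.Join(copyErr, closeErr)
	}
	return f.Name(), hex.EncodeToString(h.Sum(nil)), nil
}

// fakeStream stands in for an adapter's audio stream in this sketch.
func fakeStream(text string) io.Reader {
	return strings.NewReader(text)
}

func main() {
	path, sum, err := streamToFile(fakeStream("abc"), "tts-*.mp3")
	if err != nil {
		panic(err)
	}
	defer os.Remove(path)
	fmt.Println("stored at:", path)
	fmt.Println("content address:", sum)
}
```

Hashing during the copy avoids a second pass over the file when the attachment is later persisted under its digest.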

- Resolve merge conflicts between tts and main branches
- Convert TTS tool from custom TS implementation to Go MCP ToolExecutor (internal/mcp/providers/tts/provider.go), aligning with main's server-side tool gateway pattern
- Merge both tts_provider and browser_context features in DB schema, queries, settings DTOs, and Vue bot-settings UI
- Move TTS frontend pages from packages/web/ to apps/web/ to match main's directory structure
- Remove obsolete client-side AllowedActions/AgentAction enums
- Delete old TS TTS tool (now served via MCP)
- Regenerate sqlc, swagger, and SDK

Made-with: Cursor

Restore TTS voice attachment extraction/rendering across inbound and web chat, renumber TTS migrations to 0028/0029, and fix golangci-lint findings.

Refactor TTS adapter interface to expose per-model capabilities (DefaultModel, Models, ResolveModel) and route synthesis through a specific model. Add tts_models table, auto-import models on provider creation, full model CRUD API, and update bot settings to select a TTS model instead of a provider.
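As a rough illustration of the per-model refactor described above, the sketch below shows one plausible behaviour for DefaultModel, Models and ResolveModel: an empty model ID falls back to the default, and an unknown ID is rejected. Only the three method names come from this PR; the ModelInfo type, the staticAdapter type, the fallback rule and the model IDs are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ModelInfo carries per-model capabilities (fields are assumed).
type ModelInfo struct {
	ID      string
	Formats []string
}

var ErrUnknownModel = errors.New("unknown tts model")

// staticAdapter is a toy adapter with a fixed model list.
type staticAdapter struct {
	models []ModelInfo
}

// DefaultModel returns the first model's ID, or "" if none are registered.
func (a staticAdapter) DefaultModel() string {
	if len(a.models) == 0 {
		return ""
	}
	return a.models[0].ID
}

// Models lists every model the adapter exposes.
func (a staticAdapter) Models() []ModelInfo { return a.models }

// ResolveModel maps a requested ID to a model. An empty ID selects the
// default (an assumed convention); an unrecognised ID is an error.
func (a staticAdapter) ResolveModel(id string) (ModelInfo, error) {
	if id == "" {
		id = a.DefaultModel()
	}
	for _, m := range a.models {
		if m.ID == id {
			return m, nil
		}
	}
	return ModelInfo{}, fmt.Errorf("%w: %q", ErrUnknownModel, id)
}

func main() {
	a := staticAdapter{models: []ModelInfo{
		{ID: "model-a", Formats: []string{"mp3"}},
		{ID: "model-b", Formats: []string{"mp3", "wav"}},
	}}

	m, _ := a.ResolveModel("")
	fmt.Println(m.ID) // model-a

	m, _ = a.ResolveModel("model-b")
	fmt.Println(m.ID) // model-b

	_, err := a.ResolveModel("missing")
	fmt.Println(err) // unknown tts model: "missing"
}
```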

Fodesu · 2026-03-07T17:24:43Z

Unfortunately, the target platforms do not support streaming voice APIs, so for now we send voice as blocking audio attachments instead.

sheepbox8646 added feat/server status/in-progress labels Mar 6, 2026

Fodesu added 5 commits March 6, 2026 14:46

refactor(tts): extract voices to JSON, add VoiceConfig with lang field

6363550

fix(tts): recover audio rendering and clean lint issues

d65ab1b

Fodesu marked this pull request as ready for review March 7, 2026 14:50

Fodesu requested review from chen-ran and sheepbox8646 March 7, 2026 14:50

Fodesu wants to merge 7 commits into memohai:main from Fodesu:tts

2 participants
