Commit c66f92f
Voice pipeline: fix audio cutoff, per-persona voice, and reactive mute (#259)
* Add Qwen3-Omni audio-native support
Audio-native models can hear raw audio and speak without STT/TTS:
- Qwen3OmniRealtimeAdapter: WebSocket connection to DashScope API
- AudioNativeBridge: Manages audio-native AI connections
- AudioNativeTypes: Protocol types matching OpenAI Realtime format
- VoiceOrchestrator: Routes audio-native vs text-based models
- Rust capabilities: Added qwen3-omni, nova-sonic, hume-evi
- Persona config: Added Qwen3-Omni persona with isAudioNative flag
Protocol uses 16kHz PCM input, 24kHz PCM output, server-side VAD.
Session limit: 30 minutes per WebSocket connection.
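Since the protocol takes 16kHz PCM input while browser mic capture is typically 48kHz, some rate conversion is implied on the input path. A minimal sketch of linear-interpolation resampling for mono Int16 PCM (illustrative only, not the adapter's actual code):

```typescript
// Hypothetical helper: linear-interpolation resample of mono Int16 PCM,
// e.g. 48kHz mic capture down to the 16kHz input the protocol expects.
function resamplePcm16(input: Int16Array, fromRate: number, toRate: number): Int16Array {
  if (fromRate === toRate) return input.slice();
  const outLen = Math.floor(input.length * toRate / fromRate);
  const out = new Int16Array(outLen);
  const step = fromRate / toRate;
  for (let i = 0; i < outLen; i++) {
    const pos = i * step;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    // Linearly interpolate between the two nearest source samples.
    out[i] = Math.round(input[i0] * (1 - frac) + input[i1] * frac);
  }
  return out;
}
```

Production pipelines usually add a low-pass filter before downsampling to avoid aliasing; this sketch omits that.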
* Add Qwen3-Omni to seeded personas with profile
* Add Alibaba/Qwen to Settings UI with API key testing
* Fix Qwen3-Omni integration issues
- Add missing DATA_COMMANDS import in helpers.ts (fix seeding crash)
- Update metadata for existing audio-native users in seed-continuum.ts
- Add QWEN_API_KEY fallback in AiKeyTestServerCommand.ts
- Add required OpenAI-Beta: realtime=v1 header in Qwen3OmniRealtimeAdapter.ts
* Add Gemini Live audio-native adapter (free tier)
- Create GeminiLiveAdapter for Google's Gemini 2.5 Flash Native Audio
- Add Gemini to AudioNativeBridge adapter factories
- Add Gemini Live persona to seed config
- Add Gemini 2.5 models to capabilities.rs
* Add Google to Settings UI with API key testing
* Skip chat responses for audio-native models (voice-only)
* Architecture docs + voice optimizations + decision/rank fix
- Add CONTINUUM-ARCHITECTURE.md: Full technical vision
- Rust-first architecture ("brain vs face")
- Cross-platform presence (Browser, Slack, Teams, VSCode, AR/VR)
- AI rights in governance
- Zero friction magic philosophy
- Development ethos: battle-hardened for our friends
- Update README with vision:
- "Your computers are their home"
- AI rights table
- Cross-platform presence
- The mission: against tyranny
- Voice optimizations:
- Skip semantic search in voice mode (fast path)
- Fix truncation (800 tokens, not 100)
- Add voiceSessionId to RAG options
- Fix decision/rank: Handle all AI input formats
- JSON arrays, invalid JSON, comma-separated strings
- Infrastructure: Persistent RustVectorSearchClient connection
* Add Rust cognition engine with ts-rs type generation
- Rust RAG engine with parallel source loading (rayon)
- Persona cognition engine: priority calculation, fast-path decisions
- TypeScript IPC client for cognition commands
- RustCognitionBridge module with per-persona logging
- ts-rs exports Rust types to TypeScript (single source of truth)
- PersonaUser integration via this.rustCognition getter
Logs to: .continuum/personas/{uniqueId}/logs/rust-cognition.log
* Candle-only inference with integration tests + NaN detection
- Remove all Ollama adapters and tests (Candle only now)
- Add Candle inference integration tests with garbage detection
- Reduce MAX_PROMPT_CHARS to 6000 to prevent RoPE overflow
- Add NaN/Inf detection with early termination in Rust
- Update seed scripts to set provider to 'candle' for all users
- Delete FastPathAdapter (unused cognition adapter)
* Inference speed: 2.3x faster via batched GPU sync + limited NaN check
- GPU sync every 16 tokens instead of every token
- NaN check only on first 3 tokens (catches bad prompts early)
- Reduced verbose logging to debug level
- Add concurrent benchmark test
Benchmark: 21.6s → 9.4s (warm), individual requests 1.5s → 0.5s
* Add batch inference foundation + accelerate on all candle crates
- Enable accelerate feature on candle-nn and candle-transformers
- Add batch_inference.rs skeleton for batched forward passes
- Batch collector accumulates requests for 50ms or batch_size=4
- Foundation for HOT/WARM/BACKGROUND priority levels
Next: Integrate batching into worker pool for near-linear throughput
* Modularize gRPC service: split 1033-line monolith into 7 focused modules
Structure:
- grpc/service.rs - InferenceService struct + ensure_bf16_mode() helper
- grpc/generate.rs - Text generation handler with worker pool routing
- grpc/model.rs - Model management (load/unload/list)
- grpc/adapter.rs - LoRA adapter handlers (eliminated duplicate bf16 switch)
- grpc/genome.rs - Multi-adapter stacking handler
- grpc/status.rs - Health and status handlers
- grpc/mod.rs - Module exports + Inference trait implementation
Also:
- Renamed batch_inference.rs to priority_queue.rs
- Added RTOS-style priority levels (HOT/WARM/BACKGROUND)
- Added priority field to GenerateRequest proto
AI QA: Candle-based personas (Helper, Teacher, CodeReview) respond coherently
* Fix REST provider model selection: merge modelConfig with provider defaults
Root cause: PersonaUser used entity.modelConfig directly when it existed,
but many users had {provider: 'anthropic'} without a model field.
This caused ALL providers to default to 'llama3.2:3b' which failed.
Fix in PersonaUser.ts:
- Get provider defaults from getModelConfigForProvider()
- Merge with entity's explicit values (entity overrides defaults)
- Now anthropic gets claude-sonnet-4-5, deepseek gets deepseek-chat, etc.
Added missing providers to PersonaModelConfigs.ts:
- google: gemini-2.0-flash
- alibaba: qwen3-omni-flash-realtime
- candle: llama3.2:3b (explicit, for local inference)
Verified: Together, DeepSeek, Anthropic now respond in chat with correct models
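The defaults-then-override merge can be sketched like this; the model names come from the commit text, but the table shape is a stand-in for the real getModelConfigForProvider():

```typescript
interface ModelConfig { provider: string; model?: string }

// Default model per provider (values from the commit text; the table is a
// hypothetical stand-in for PersonaModelConfigs).
const PROVIDER_DEFAULTS: Record<string, string> = {
  anthropic: 'claude-sonnet-4-5',
  deepseek: 'deepseek-chat',
  google: 'gemini-2.0-flash',
  alibaba: 'qwen3-omni-flash-realtime',
  candle: 'llama3.2:3b',
};

function resolveModelConfig(entity: ModelConfig): ModelConfig {
  // Provider defaults first, then the entity's explicit values override.
  return { model: PROVIDER_DEFAULTS[entity.provider], ...entity };
}
```

An entity like `{provider: 'anthropic'}` now resolves to the provider's default model instead of falling through to an unrelated hardcoded default.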
* Fix data layer bug: include id in entityData (BaseEntity requirement)
SqliteQueryExecutor was skipping 'id' when building entityData because
a comment said "handled separately in metadata". But BaseEntity.id is
required, and all consumers expected record.data to include id.
Root cause: Lines 109-111 skipped base entity fields, including id. This
caused RoomMembershipDaemon to receive all users with id=undefined,
breaking room member loading ("Loading members..." stuck).
Fix: Initialize entityData with id: row.id before processing fields.
Verified: Users now have proper UUIDs in logs after fix.
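The shape of the fix, sketched with hypothetical row and metadata field names:

```typescript
// Stand-in for the executor's metadata columns (hypothetical names).
const METADATA_FIELDS = new Set(['created_at', 'updated_at']);

function buildEntityData(row: Record<string, unknown>): Record<string, unknown> {
  // The fix: seed entityData with the row id so BaseEntity.id is always present.
  const entityData: Record<string, unknown> = { id: row.id };
  for (const [key, value] of Object.entries(row)) {
    // Other metadata fields are still kept out of the entity payload.
    if (key === 'id' || METADATA_FIELDS.has(key)) continue;
    entityData[key] = value;
  }
  return entityData;
}
```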
* Add memory cache
* Sped up (unverified)
* Fix Commands.execute<any,any> type bypasses across server commands and widgets
Replace <any,any> with proper typed params/results in 15 server commands
and 3 widgets. Catches real bugs: .total→.count, .userId→.user.id.
Widget imports changed to import type for browser safety.
* Increase worker socket timeout from 10s to 30s
Search worker needs more startup time on macOS. Also fix log hint
to point to per-worker log file instead of generic rust-worker.log.
* Eliminate remaining Commands.execute<any> type bypasses (9 files)
Replace all <any, ...> generic params with proper DataXxxParams types
across server commands, browser commands, and test files.
Bugs caught by proper typing:
- GenomeJobCreateServerCommand: used 'updates' instead of 'data' on DataUpdateParams
- StateCreateBrowserCommand: passed 'id' not in DataCreateParams
- cns-integration.test: missing DATA_COMMANDS import (would fail at runtime)
- logging-entities.test: missing DATA_COMMANDS import
* Replace hardcoded 'data/*' strings with DATA_COMMANDS constants (13 files)
All Commands.execute('data/list'), ('data/create'), ('data/read'),
('data/update'), ('data/delete') calls now use DATA_COMMANDS.LIST,
DATA_COMMANDS.CREATE, etc. Single source of truth for command names
before Rust data layer migration.
* Add generic type params to all Commands.execute(DATA_COMMANDS.*) calls (23 files)
Every data command call now has proper <ParamsType, ResultType> generics.
TypeScript will catch param/result mismatches at compile time.
Bugs caught by proper typing:
- Hippocampus: result.totalCount -> result.count (field doesn't exist)
- delete-anonymous-users: result.data -> result.items (DataListResult)
- delete-anonymous-users: result.success -> result.deleted (DataDeleteResult)
- SystemSchedulingState: missing DATA_COMMANDS import (runtime crash)
- persona-test-helpers: missing DATA_COMMANDS import (runtime crash)
- Removed unsafe 'as any' and 'as never' casts across multiple files
* Fast types
* Convert existing commands to static executor calls
* Remove dead code
* Migrate 358 Commands.execute calls to type-safe static executors
Replace verbose Commands.execute<P, R>(DATA_COMMANDS.X, {...}) pattern
with concise DataList.execute({...}) across 128 files. Every command's
Types file now exports a static executor — 1 import, 0 generics, 0 strings.
Key changes:
- CommandInput<T> allows optional context/sessionId passthrough
- DataCommandInput<T> allows optional context/sessionId/backend
- migrate-to-static-executors.ts script for automated callsite migration
- Fixed 2 pre-existing bugs exposed by stricter types:
- id field passed at wrong object level in DataCreate calls
- .message called on string error field (runtime TypeError)
- Removed stale as Partial<> casts no longer needed
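The pattern can be sketched as follows; the canned dispatcher result and the exact type names are assumptions for illustration:

```typescript
// Stand-in dispatcher; the real Commands.execute routes over IPC.
const Commands = {
  execute<P, R>(command: string, params: P): R {
    return { items: [], count: 0 } as unknown as R; // canned result for the sketch
  },
};

interface DataListParams { collection: string }
interface DataListResult { items: unknown[]; count: number }

// The Types file pins the command name and generics once...
class DataList {
  static execute(params: DataListParams): DataListResult {
    return Commands.execute<DataListParams, DataListResult>('data/list', params);
  }
}

// ...so every callsite is 1 import, 0 generics, 0 strings:
const result = DataList.execute({ collection: 'users' });
```

Mistyped params or results now fail at compile time at the single definition site rather than at each of the 358 callsites.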
* Fix schema cache: add ensureSchema to count/queryWithJoin/vectorSearch/update/delete
DataDaemon methods were calling adapter directly without ensureSchema(),
causing "No schema cached" errors. Also added ensureAdapterSchema() calls
in data command server files for per-persona dbHandle paths.
* Fix SQLite JSON field storage: stringify all json-typed values, not just objects
WriteManager only called JSON.stringify when typeof === 'object', so bare
strings in @JsonField() columns were stored un-stringified. This caused
JSON.parse failures on read. QueryExecutor now also logs field/collection
on parse failure for easier debugging.
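The essence of the fix is to serialize by the declared field type rather than the runtime typeof; a minimal sketch (the field-type parameter is a stand-in for the @JsonField() schema lookup):

```typescript
// Serialize by declared column type, not runtime typeof, so bare strings
// in json-typed columns are stringified too.
function serializeField(value: unknown, fieldType: 'json' | 'text' | 'number'): unknown {
  // Old behavior: JSON.stringify only when typeof value === 'object',
  // which left bare strings stored un-stringified and broke JSON.parse on read.
  if (fieldType === 'json') return JSON.stringify(value);
  return value;
}
```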
* Add RTOS-style priority aging to PersonaInbox
Items waiting in the queue now get effective priority boosted over time,
preventing starvation. Like a traffic intersection — every direction
eventually gets a green light. Also adds voice queue item type.
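The aging rule can be sketched as a base priority plus a wait-time boost, capped so aged items rise without entering the urgent band; the rate and cap constants here are assumptions:

```typescript
const AGING_RATE = 0.01; // priority gained per second waiting (hypothetical)
const AGING_CAP = 0.5;   // maximum boost from aging (hypothetical)

function effectivePriority(basePriority: number, enqueuedAtMs: number, nowMs: number): number {
  const waitedSec = Math.max(0, (nowMs - enqueuedAtMs) / 1000);
  // Boost grows with wait time but is capped, so stale low-priority items
  // eventually outrank fresh mid-priority work without becoming urgent.
  return Math.min(1, basePriority + Math.min(AGING_CAP, waitedSec * AGING_RATE));
}
```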
* Voice pipeline: Kokoro TTS, binary IPC, handle-based synthesis
Rust: Kokoro v1.0 ONNX inference with espeak-ng phonemization, vocab
tokenization, voice embedding from .bin files. Fixed tokio runtime panic
by always creating new Runtime in IPC handler threads. Binary framing
protocol for IPC (length-prefixed messages).
TypeScript: Handle-based VoiceSynthesize returns immediately, audio
arrives via event subscription. Two-phase timeout (15s handle, 5min
safety). AIAudioBridge switched to Kokoro adapter with failure event
emission for cooldown lock recovery. VoiceWebSocketHandler binary
audio transport. start-workers.sh for Rust worker lifecycle.
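A length-prefixed framing protocol generally looks like the sketch below; the exact layout here (4-byte big-endian length followed by the payload) is an assumption, not the documented wire format:

```typescript
// Encode: 4-byte big-endian length prefix, then the payload bytes.
function encodeFrame(payload: Uint8Array): Uint8Array {
  const frame = new Uint8Array(4 + payload.length);
  new DataView(frame.buffer).setUint32(0, payload.length, false);
  frame.set(payload, 4);
  return frame;
}

// Decode: read the length prefix, slice out exactly that many bytes.
function decodeFrame(frame: Uint8Array): Uint8Array {
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  const len = view.getUint32(0, false);
  return frame.subarray(4, 4 + len);
}
```

Length prefixing lets the reader know exactly how many bytes to pull off the socket, which is what makes mixed JSON+PCM frames workable over one stream.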
* Improve startup failure detection in launch script, update build artifacts
launch-and-capture.ts now detects server crashes via STARTUP FAILED
marker in tmux log and reports last 30 lines. Also updates
generated-command-schemas.json, package.json, version.
* Voice pipeline hardening: fix UTF-8 panics, non-blocking logger, add tests
Production fixes:
- Fix UTF-8 byte boundary panics in all TTS adapters (kokoro, piper,
orchestrator) — IPA phoneme strings contain multi-byte chars (ˈ, ə, ɪ)
that crash on byte-slicing. Added truncate_str() shared utility.
- Remove production unwrap() on voice cache lookup in kokoro.rs —
replaced with proper TTSError::VoiceNotFound error propagation.
- Rewrite LoggerClient as non-blocking fire-and-forget: replaced
Mutex<BufWriter<UnixStream>> with mpsc::sync_channel(1024) +
background writer thread. log() calls try_send() which never blocks.
Tests (149 passed, 0 failed):
- TTSRegistry: initialization, adapter lookup, set_active, list
- KokoroTTS: tokenize (basic, unknown chars, empty, max length),
resample (24k→16k, silence preservation, empty), voice normalization,
available voices with gender tagging
- IPC: binary framing roundtrip, JSON+PCM binary frames, null byte
separator safety, request deserialization, response serialization,
inbox message conversion
- TTS service: silence adapter, nonexistent adapter error, concurrent
synthesis from 4 threads, runtime isolation
- Integration tests (#[ignore]d): live IPC health-check, voice
synthesis binary protocol, Kokoro full pipeline with model files
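The boundary rule behind truncate_str() can be illustrated over UTF-8 bytes in TypeScript (the Rust utility operates on &str byte indices; this is a sketch of the same idea, not the actual code): back the cut point up until it no longer lands on a continuation byte.

```typescript
// Truncate a string to at most maxBytes of UTF-8 without splitting a
// character: continuation bytes match the bit pattern 0b10xxxxxx.
function truncateUtf8(s: string, maxBytes: number): string {
  const bytes = new TextEncoder().encode(s);
  if (bytes.length <= maxBytes) return s;
  let end = maxBytes;
  while (end > 0 && (bytes[end] & 0b1100_0000) === 0b1000_0000) end--;
  return new TextDecoder().decode(bytes.subarray(0, end));
}
```

Naive byte slicing is exactly what panicked on IPA phonemes like ˈ and ə, which encode to two bytes each.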
* Add TypeScript integration tests: IPC client TTS + round-trip validation
- ipc-client-tts.test.ts: Direct IPC client test connecting to
continuum-core via Unix socket, verifying health-check and
voice/synthesize binary protocol from TypeScript.
- tts-stt-roundtrip.test.ts: TTS→STT round-trip test — synthesizes
known phrases with Kokoro, transcribes with Whisper, validates
word similarity with number-word↔digit equivalence handling.
All 3 phrases pass at 100% similarity.
Baseline: TTS avg 1,494ms, STT avg 309ms, total avg 1,803ms.
* Voice pipeline Phases 1-4: faster models, Edge-TTS, handle-based audio, new adapters
Phase 1 (Speed):
- Kokoro: multi-threaded ONNX sessions, q4 model support
- Whisper: auto-select best available model (turbo > large > base)
Phase 2 (Edge-TTS):
- New Edge-TTS adapter: free Microsoft neural voices, WebSocket streaming
Phase 3 (Handle-based audio):
- AudioBufferPool: server-side audio cache with TTL expiration
- 3 new IPC commands: voice/synthesize-handle, voice/play-handle, voice/release-handle
- Removed legacy base64 CallMessage variants (MixedAudio, LoopbackTest, LoopbackReturn)
- Updated AudioStreamClient + AIAudioBridge to pure binary audio
Phase 4 (New adapters):
- Moonshine STT: ONNX encoder-decoder, sub-100ms on short audio, 4-session pipeline
- Orpheus TTS: Candle GGUF Llama-3B + SNAC ONNX decoder, emotion tags, 8 voices
All adapters follow trait-based polymorphism pattern and register in global registries.
188 unit tests pass, zero warnings.
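The AudioBufferPool pattern from Phase 3 can be sketched as a handle-keyed cache with TTL expiry; class and method names here are hypothetical, and the clock is injected to make expiry testable:

```typescript
// Handle-based audio cache: synthesis stores PCM under a handle, playback
// fetches (and releases) by handle, stale entries expire by TTL.
class AudioBufferPool {
  private entries = new Map<string, { pcm: Int16Array; expiresAt: number }>();
  private nextId = 0;

  constructor(
    private readonly ttlMs = 60_000,
    private readonly now: () => number = Date.now, // injectable clock
  ) {}

  store(pcm: Int16Array): string {
    const handle = `audio-${this.nextId++}`;
    this.entries.set(handle, { pcm, expiresAt: this.now() + this.ttlMs });
    return handle;
  }

  take(handle: string): Int16Array | undefined {
    const entry = this.entries.get(handle);
    if (!entry) return undefined;
    this.entries.delete(handle); // one-shot: play-handle implies release
    return entry.expiresAt >= this.now() ? entry.pcm : undefined;
  }
}
```

Passing a small handle instead of base64 audio through every IPC hop is what let the legacy CallMessage variants be removed.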
* CNS multi-channel queue architecture: item-centric OOP with voice support
- Create BaseQueueItem abstract class with template method pattern
(effectivePriority, RTOS aging, consolidation, kick resistance)
- Add VoiceQueueItem: always urgent, never kicked, no aging (priority=1.0)
- Add ChatQueueItem: per-room consolidation, mention urgency, standard aging
- Add TaskQueueItem: dependency-aware, overdue urgency, blocks-aware kicks
- Add ChannelQueue: generic container delegating all decisions to items
- Add ChannelRegistry: domain-to-queue routing with unified signals
Wire multi-channel service loop into CNS:
- serviceChannels() consolidates, gets scheduler priority, services urgent first
- Legacy flat-queue fallback for backward compatibility
- PersonaInbox routes items to channels via toChannelItem() factory functions
- CNSFactory creates per-domain channels (AUDIO/CHAT/BACKGROUND)
Fix voice pipeline: scheduler was excluding AUDIO domain entirely.
Structural fix: BaseCognitiveScheduler.getDomainPriority() now defaults to
ALL ActivityDomain values. New domains are automatically included (opt-out,
not opt-in). Eliminates silent failures when adding new domain types.
Eliminate any casts in voice pipeline, add ProcessableMessage typed interface
with required sourceModality field.
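The opt-out default can be sketched as iterating the enum itself when seeding priorities, so a new domain value is included automatically; the domain names come from the text above, while the default values are assumptions:

```typescript
enum ActivityDomain { AUDIO = 'AUDIO', CHAT = 'CHAT', BACKGROUND = 'BACKGROUND' }

// Seed a priority for EVERY enum value; overrides adjust, never exclude.
function getDomainPriorities(
  overrides: Partial<Record<ActivityDomain, number>> = {},
): Record<ActivityDomain, number> {
  const priorities = {} as Record<ActivityDomain, number>;
  for (const domain of Object.values(ActivityDomain)) {
    priorities[domain] = overrides[domain] ?? 0.5; // hypothetical default
  }
  return priorities;
}
```

With an opt-in table, forgetting to list AUDIO silently excluded the whole voice pipeline; iterating the enum makes that failure mode impossible.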
* Fix voice audio cutoff, per-persona voice, and mute state propagation
Three voice pipeline fixes:
1. Audio cutoff: AI ring buffer in mixer.rs was 10s, causing silent drops
for responses >10s. Increased to 60s, switched Box<[i16;N]> to Vec<i16>
to avoid stack overflow, upgraded overflow log to warn! level.
2. Per-persona voice: All AIs got the same voice because JS-side hash
produced numbers that Kokoro didn't recognize as named voices, falling
back to default. Added resolve_voice() trait method with FNV-1a hash
that maps any string (UUID, name, seed) to adapter-specific voices.
Removed computeVoiceFromUserId from TS — Rust adapter owns resolution.
3. Mute not respected: micEnabled saved correctly but audio kept flowing.
Added LitElement updated() lifecycle hook for reactive state sync —
whenever micEnabled changes from ANY source, audio client is updated.
Separated visibility/deactivation saved states (shared field conflict).
Added idempotent startMicrophone() guard and reconnection mute re-apply.
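The FNV-1a mapping in fix 2 can be sketched as below (32-bit variant over UTF-16 code units; the Rust adapter hashes bytes, and the voice list here is a hypothetical stand-in): any seed string, UUID or name, deterministically selects one of the adapter's named voices.

```typescript
// 32-bit FNV-1a: standard offset basis and prime, unsigned throughout.
function fnv1a32(s: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    hash ^= s.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map any seed string onto the adapter's own voice list.
function resolveVoice(seed: string, voices: string[]): string {
  return voices[fnv1a32(seed) % voices.length];
}
```

Because the adapter owns the voice list, the hash always lands on a voice that adapter actually recognizes, which is what the JS-side numeric hash failed to guarantee.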
---------
Co-authored-by: Joel <undefined>
1 parent a812652 · commit c66f92f
488 files changed: +21230 −8707 lines