Bevy avatar rendering + LiveKit video pipeline #274
Merged
…ideo pipeline
- ORMRustClient: Single socket → 4-connection pool with least-busy routing. Eliminates IPC serialization bottleneck (all agents competed for 1 pipe)
- Bevy headless renderer: VRM avatar models rendered at 5fps into render targets. Single shared directional light (fixes Bevy 10-light limit)
- LiveKit agents: Pre-created on room join with 500ms stagger (not on first speak)
- LiveKit agents publish video tracks (640x480) from Bevy frame readback
- LiveKit server: --node-ip 127.0.0.1 fixes ICE candidate negotiation on localhost
- LiveJoinServerCommand: Stale call detection on server restart
- LiveWidget: Video track attachment for remote participants
- data-clear: Clears calls collection to avoid stale LiveKit room references
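The least-busy routing described for the connection pool can be sketched as follows. This is a minimal illustration, not the ORMRustClient code: `PooledConn`, `ConnPool`, `acquire`, and `release` are hypothetical names, and "busy" is assumed to mean the number of requests sent but not yet answered.

```rust
/// One pooled connection. `in_flight` counts requests that have been
/// sent but not yet answered (an assumed definition of "busy").
struct PooledConn {
    id: usize,
    in_flight: usize,
}

/// Hypothetical fixed-size pool that routes each request to the
/// connection with the fewest outstanding requests.
struct ConnPool {
    conns: Vec<PooledConn>,
}

impl ConnPool {
    fn new(size: usize) -> Self {
        let conns = (0..size)
            .map(|id| PooledConn { id, in_flight: 0 })
            .collect();
        ConnPool { conns }
    }

    /// Choose the least-busy connection, mark one more request in
    /// flight on it, and return its id. Ties go to the lowest id,
    /// because `min_by_key` returns the first minimum.
    fn acquire(&mut self) -> usize {
        let conn = self
            .conns
            .iter_mut()
            .min_by_key(|c| c.in_flight)
            .expect("pool must not be empty");
        conn.in_flight += 1;
        conn.id
    }

    /// Record that a request on connection `id` has completed.
    fn release(&mut self, id: usize) {
        if let Some(conn) = self.conns.iter_mut().find(|c| c.id == id) {
            conn.in_flight = conn.in_flight.saturating_sub(1);
        }
    }
}
```

With a pool of 4, the first four requests spread across connections 0 to 3. If connection 2 finishes first, the next request goes to it rather than queueing behind a busy connection.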
Human transcriptions from live calls were ephemeral (events only). Now saved as ChatMessageEntity with sourceModality:'voice' metadata. Cached callSessionId→roomId lookup avoids repeated DB queries. chat/export --room="general" now includes both text and voice messages.
Startup:
- parallel-start.sh: concurrent TS + Rust builds (61s vs 130s)
- system-stop.sh: comprehensive process cleanup (tmux, ports, sockets)
- SystemOrchestrator: ping-based bootstrap check prevents white screen
- start-workers.sh: --skip-build flag when caller already built

Voice/LiveKit fixes:
- WebRTC VAD: fix aggressiveness mapping (all levels were VERY_AGGRESSIVE)
- ProductionVAD: add debug logging for speech detection pipeline
- voice module: phased agent connection (STT first, 2s stagger between agents)

Avatar rendering:
- MAX_AVATAR_SLOTS 14->24, use constant instead of hardcoded slot check
- Dark fallback for avatar tiles, video container CSS with fade-in
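For the WebRTC VAD fix listed above, the intended behavior is that each aggressiveness level selects a distinct mode instead of every level collapsing to VERY_AGGRESSIVE. WebRTC VAD defines four modes (0 to 3). The sketch below shows one way such a mapping could look; `VadMode` and `vad_mode_from_level` are hypothetical names, and clamping out-of-range levels to the most aggressive mode is an assumption, not something the commit states.

```rust
/// The four WebRTC VAD aggressiveness modes, from least to most
/// aggressive. The enum itself is a stand-in for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VadMode {
    Quality,
    LowBitrate,
    Aggressive,
    VeryAggressive,
}

/// Map a numeric aggressiveness level to a distinct mode.
/// Levels above 2 map to the most aggressive mode (assumed policy).
fn vad_mode_from_level(level: u8) -> VadMode {
    match level {
        0 => VadMode::Quality,
        1 => VadMode::LowBitrate,
        2 => VadMode::Aggressive,
        _ => VadMode::VeryAggressive,
    }
}
```

The bug class described in the commit corresponds to a mapping where every arm returns `VadMode::VeryAggressive`, which makes the configured level meaningless.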
- Remove `= 'default'` from all 15 ORM method signatures: handle is now REQUIRED, compiler catches every missing handle
- Fix 32 callsites across commands/daemons/system with explicit 'default' for shared-data operations
- Fix persona handle propagation: Hippocampus opens longterm.db and now exposes the handle via waitForDbInit(). LimbicSystem.propagateDbHandle() pushes it to PersonaMemory after init. Fixes 14 personas thrashing main DB causing 27s IPC timeouts.
ts-rs 9→12 (eliminates 68 serde parse warnings), candle 0.8→0.9, rusqlite 0.32→0.38, tonic/prost 0.11/0.12→0.14, safetensors 0.4→0.7, fastembed 4→5, ort rc.9→rc.11, thiserror 1→2, tokio-tungstenite 0.21→0.28, half 2.3→2.7, hf-hub 0.4→0.5, tokenizers 0.20→0.22, ndarray 0.16→0.17
Split 1431-line monolith into orchestrator + 3 sub-components:
- LiveParticipantTile: owns video-container in its own shadow DOM
- LiveControls: SVG icons + media buttons, fires events up
- LiveCaptions: multi-speaker transcription with auto-fade

SCSS split into per-component files. LiveWidget retains session lifecycle, AudioStreamClient, state persistence, layout decisions.
- LiveParticipantTile: render videoElement directly in Lit template instead of imperative querySelector + appendChild
- LiveWidget: bind all child state via Lit properties
- Use repeat() directive for keyed participant lists
- Use ref() directive for captions/controls refs
Generated .css, .styles.ts, .css.map files from compile-sass.ts should not be checked in — they're regenerated by npm start.
- Render at 1280x720 (was 640x480) for crisp video conference tiles
- Disable WebRTC simulcast, set 2.5Mbps explicit bitrate to prevent adaptive compression blur on avatar video tracks
- Load 3D VRM model at connect time, not deferred to first speech
- Add deterministic_pick(id, options, salt) using FNV-1a hash for stable trait selection from any array given a unique ID
- Derive avatar gender from persona identity when voice isn't known, so model selection is immediate and consistent
- Remove transcription-to-chat persistence (transcriptions are live captions, not chat messages) and add browser-side dedup
- Add ResizeObserver on tiles + data channel for future dynamic resize
- Add ts-rs generated TileResolution/ResolutionTierWire types
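A function with the shape of deterministic_pick(id, options, salt) could be sketched like this. The commit only names the function and the FNV-1a hash; the 64-bit variant, the `Option` return type, the `fnv1a_64` helper, and the way the salt is mixed in (id bytes, a zero separator byte, then salt bytes) are assumptions made for illustration.

```rust
// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a: XOR each byte into the hash, then multiply by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    bytes.iter().fold(FNV_OFFSET_BASIS, |hash, &b| {
        (hash ^ u64::from(b)).wrapping_mul(FNV_PRIME)
    })
}

/// Pick one element of `options` that is stable for a given
/// (id, salt) pair. Returns `None` for an empty slice.
/// The salt lets different traits (for example avatar vs. voice)
/// draw independent choices from the same id.
fn deterministic_pick<'a, T>(id: &str, options: &'a [T], salt: &str) -> Option<&'a T> {
    if options.is_empty() {
        return None;
    }
    // Assumed key layout: id, a 0 separator, then the salt.
    let mut key = Vec::with_capacity(id.len() + 1 + salt.len());
    key.extend_from_slice(id.as_bytes());
    key.push(0);
    key.extend_from_slice(salt.as_bytes());
    let index = (fnv1a_64(&key) % options.len() as u64) as usize;
    options.get(index)
}
```

Because the result depends only on the id, the salt, and the contents of `options`, the same persona gets the same trait across restarts as long as the array is unchanged. Adding or removing options changes the modulus and can reshuffle assignments.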
… modular architecture
- Avatar selection enforces gender coherence: avatar gender always matches voice gender
- Single source of truth: gender_from_identity(user_id) seeds both avatar and TTS voice
- Gender is a SEED default, designed for future override via user.state preferences
- TTS gender_hint flows through entire chain: speak_in_call → tts_service → tts::synthesize
- resolve_voice_gendered() filters voice catalog by gender before deterministic hash
- FPS bumped 15→24, mouth weight windows 200ms→66ms, amplitude scaling 0.7→1.0
- Video encoding: 800kbps→1.2Mbps, 15fps→24fps for smoother lip sync
- VRM 1.0 models (169 joints) filtered from catalog because Bevy can't render >128 joints
- Modular avatar/ module: catalog, selection, gender, frame_analysis, backends, types
- Health check at frame 150/300 detects Empty/BrokenGeometry (log only, no fallback)
- 76 avatar tests + 6 TTS service tests passing
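The selection flow described above (derive gender once from the identity, drop models Bevy cannot render, filter by gender, then hash into what remains) might look roughly like the sketch below. Only the name gender_from_identity and the 128-joint limit come from the commit. The types, `pick_avatar`, the parity-based gender rule, and the "avatar" salt are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Gender {
    Female,
    Male,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct AvatarModel {
    name: &'static str,
    gender: Gender,
    joints: u32,
}

/// Models with more joints than this are filtered out (per the commit).
const MAX_RENDERABLE_JOINTS: u32 = 128;

/// 64-bit FNV-1a over a byte slice.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0xcbf2_9ce4_8422_2325u64, |h, &b| {
        (h ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3)
    })
}

/// Assumed rule: hash parity of the user id decides the seed gender.
fn gender_from_identity(user_id: &str) -> Gender {
    if fnv1a_64(user_id.as_bytes()) % 2 == 0 {
        Gender::Female
    } else {
        Gender::Male
    }
}

/// Filter the catalog by renderability and gender, then pick a
/// stable entry using a salted hash of the user id.
fn pick_avatar<'a>(user_id: &str, catalog: &'a [AvatarModel]) -> Option<&'a AvatarModel> {
    let gender = gender_from_identity(user_id);
    let candidates: Vec<&AvatarModel> = catalog
        .iter()
        .filter(|m| m.joints <= MAX_RENDERABLE_JOINTS && m.gender == gender)
        .collect();
    if candidates.is_empty() {
        return None;
    }
    // Salt the key so the index is not tied to the gender parity bit.
    let key = format!("{user_id}\0avatar");
    let index = (fnv1a_64(key.as_bytes()) % candidates.len() as u64) as usize;
    Some(candidates[index])
}
```

A voice picker would follow the same pattern over the voice catalog with a different salt, which is what keeps avatar and voice gender coherent: both start from the same gender_from_identity result.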
Reduces male model sharing from 4 duplicates to 1. 9 male agents now have 7 unique models to draw from. Both models are VRM 0.x with full mouth morph targets for lip sync.
- wv-sakurada2: Sakurada Fumiriya variant (107 joints, M00_ materials)
- wv-shilo: Hand-rigged male (85 joints, sideburn bones, 17 blend shapes)
Summary
16 commits
Confirmed working at d101d32 — STT, subtitles, AI responses all function.
Test plan