| 1 | +# Cast — LiveKit Media Streaming |
| 2 | + |
| 3 | +This document explains how live video and audio streaming works in the explorer via LiveKit rooms. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The cast feature allows scenes to display live video/audio streams from LiveKit rooms. A scene attaches a `PBVideoPlayer` or `PBAudioStream` SDK component to an entity and sets its source to a `livekit-video://` URL; the explorer then connects to the room and routes media to the in-world screen.
| 8 | + |
| 9 | +Two player backends exist side by side: |
| 10 | + |
| 11 | +| Backend | URL scheme | Use case | |
| 12 | +|---------|-----------|----------| |
| 13 | +| **AvProPlayer** | `http://`, `https://` | Pre-recorded or HLS video | |
| 14 | +| **LivekitPlayer** | `livekit-video://` | Real-time room streams | |
| 15 | + |
| 16 | +The `MultiMediaPlayer` REnum wraps both behind a unified interface, so the ECS systems do not need to know which backend is active.
| 17 | + |
| 18 | +--- |
| 19 | + |
| 20 | +## Address Types — `LivekitAddress` |
| 21 | + |
| 22 | +`LivekitAddress` is an REnum (discriminated union) with two variants: |
| 23 | + |
| 24 | +### CurrentStream |
| 25 | + |
| 26 | +``` |
| 27 | +livekit-video://current-stream |
| 28 | +``` |
| 29 | + |
| 30 | +Picks the first available video track in the room, then **follows the active speaker** (see [Active Speaker Tracking](#active-speaker-tracking-video-follows-voice) below). This is the default mode for streaming theatre screens.
| 31 | + |
| 32 | +### UserStream |
| 33 | + |
| 34 | +``` |
| 35 | +livekit-video://{identity}/{sid} |
| 36 | +``` |
| 37 | + |
| 38 | +Pins to a specific participant's track by identity and stream ID (the `{identity}` and `{sid}` segments of the URL). No automatic switching occurs.
| 39 | + |
| 40 | +Defined in `LivekitAddress.cs`. Helper extensions in `LiveKitMediaExtensions.cs` handle parsing. |
| 41 | + |
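As an illustration of the two URL shapes above, here is a minimal TypeScript sketch of the parsing step. It is not the code in `LiveKitMediaExtensions.cs`: the `LivekitAddress` type below is a hypothetical TypeScript model of the C# REnum, `parseLivekitAddress` is an invented name, and splitting at the first `/` is an assumption about how `{identity}/{sid}` is separated.

```typescript
// Hypothetical TypeScript model of the two address variants.
// The real type is the C# REnum defined in LivekitAddress.cs.
type LivekitAddress =
  | { kind: "CurrentStream" }
  | { kind: "UserStream"; identity: string; sid: string };

const LIVEKIT_SCHEME = "livekit-video://";

// Returns null when the URL is not a LiveKit address or is malformed.
// Assumption: identity and sid are separated by the first "/" after the scheme.
function parseLivekitAddress(url: string): LivekitAddress | null {
  if (!url.startsWith(LIVEKIT_SCHEME)) return null;

  const rest = url.slice(LIVEKIT_SCHEME.length);
  if (rest === "current-stream") return { kind: "CurrentStream" };

  const slash = rest.indexOf("/");
  // Reject a missing separator, an empty identity, or an empty sid.
  if (slash <= 0 || slash === rest.length - 1) return null;

  return {
    kind: "UserStream",
    identity: rest.slice(0, slash),
    sid: rest.slice(slash + 1),
  };
}
```

Modelling the address as a tagged union mirrors the REnum idea: callers switch on `kind` and cannot read `identity` or `sid` from a `CurrentStream` value.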
| 42 | +--- |
| 43 | + |
| 44 | +## Video Routing |
| 45 | + |
| 46 | +### How the first video track is selected |
| 47 | + |
| 48 | +When `OpenMedia()` is called: |
| 49 | + |
| 50 | +- **CurrentStream** → `FirstVideoTrackingIdentity()` iterates all remote participants (under lock) and returns the first video track found. The participant's identity is stored in `currentVideoIdentity`. |
| 51 | +- **UserStream** → Directly opens the stream for the specified `(identity, sid)`. |
| 52 | + |
| 53 | +### Active Speaker Tracking (video-follows-voice) |
| 54 | + |
| 55 | +In `CurrentStream` mode, the video automatically switches to whoever is speaking. This is driven by `TryFollowActiveSpeaker()`, which runs every frame inside `EnsureVideoIsPlaying()`. |
| 56 | + |
| 57 | +**How it works:** |
| 58 | + |
| 59 | +1. `room.ActiveSpeakers` (provided by the LiveKit SDK) is an ordered collection of participant identities currently speaking — first element = highest audio level. |
| 60 | +2. Each frame, `TryFollowActiveSpeaker()` reads the dominant speaker. |
| 61 | +3. If the dominant speaker differs from the current video identity **and** enough time has passed since the last switch, the video stream is swapped. |
| 62 | + |
| 63 | +**Debounce:** A minimum hold time of **1.5 seconds** (`MIN_SPEAKER_HOLD_SECONDS`) prevents flickering during rapid speaker changes. |
| 64 | + |
| 65 | +**Fallback rules:** |
| 66 | + |
| 67 | +| Scenario | Behavior | |
| 68 | +|----------|----------| |
| 69 | +| Active speaker has no video track | Keep current video | |
| 70 | +| No one is speaking | Keep current video | |
| 71 | +| Rapid speaker changes (<1.5s) | Debounced — stays on current | |
| 72 | +| UserStream mode | No auto-switching (early return) | |
| 73 | + |
| 74 | +**Key methods in `LivekitPlayer.cs`:** |
| 75 | + |
| 76 | +- `FirstVideoTrackingIdentity()` — Selects first video track and records identity |
| 77 | +- `TryFollowActiveSpeaker()` — Core speaker-tracking logic with debounce |
| 78 | +- `FindVideoTrackForParticipant(identity)` — Looks up a participant's video track by identity |
| 79 | + |
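The switching rule and its fallbacks can be sketched as a pure function. This is a simplified TypeScript model, not the C# code in `LivekitPlayer.cs`: `SpeakerState`, `pickSpeakerSwitch`, `lastSwitchTime`, and the `hasVideoTrack` callback are hypothetical names, and only `MIN_SPEAKER_HOLD_SECONDS` and `currentVideoIdentity` come from the text above.

```typescript
const MIN_SPEAKER_HOLD_SECONDS = 1.5;

// Hypothetical state carried between frames.
interface SpeakerState {
  currentVideoIdentity: string | null;
  lastSwitchTime: number; // seconds, same clock as `now`
}

// Returns the identity to switch the video to, or null to keep the current video.
function pickSpeakerSwitch(
  state: SpeakerState,
  activeSpeakers: readonly string[], // ordered, loudest speaker first
  hasVideoTrack: (identity: string) => boolean,
  now: number,
): string | null {
  const dominant = activeSpeakers[0];
  if (dominant === undefined) return null; // no one is speaking
  if (dominant === state.currentVideoIdentity) return null; // already shown
  if (now - state.lastSwitchTime < MIN_SPEAKER_HOLD_SECONDS) return null; // debounced
  if (!hasVideoTrack(dominant)) return null; // speaker has no video track
  return dominant;
}
```

A caller would run this once per frame in `CurrentStream` mode and, on a non-null result, swap the stream and record `now` as the new `lastSwitchTime`. In `UserStream` mode the function would not be called at all, matching the early return in the table.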
| 80 | +--- |
| 81 | + |
| 82 | +## Audio Routing |
| 83 | + |
| 84 | +Audio is handled independently from video. |
| 85 | + |
| 86 | +### All tracks play simultaneously |
| 87 | + |
| 88 | +`OpenAllAudioStreams()` iterates **every remote participant** in the room and opens **every audio track** it finds. Each track gets its own pooled `LivekitAudioSource` from a `ThreadSafeObjectPool`. This means: |
| 89 | + |
| 90 | +- All participants' microphones are heard at once (like a conference call). |
| 91 | +- Audio is **not** tied to the currently displayed video — you always hear everyone. |
| 92 | +- Volume and spatial positioning are applied uniformly to all sources. |
| 93 | + |
| 94 | +### Spatial audio |
| 95 | + |
| 96 | +When the SDK component has `spatial = true`, audio sources are positioned in 3D space via `PlaceAudioAt(position)`. Min/max distance is configured through the SDK component fields. |
| 97 | + |
| 98 | +### Paired audio (reserved) |
| 99 | + |
| 100 | +`FindPairedAudio()` maps a video track to its companion audio track (camera → microphone, screenshare → screenshare audio). This exists for future use but is not currently active — all audio plays regardless. |
| 101 | + |
| 102 | +--- |
| 103 | + |
| 104 | +## Stream Recovery (Self-Healing) |
| 105 | + |
| 106 | +Both video and audio streams can die at any time (participant disconnects, network issues). The system self-heals via two methods called every frame from `UpdateMediaPlayerSystem`: |
| 107 | + |
| 108 | +### `EnsureVideoIsPlaying()` |
| 109 | + |
| 110 | +``` |
| 111 | +Video dead  + UserStream mode    → Fall back to CurrentStream (first available track)
| 112 | +Video dead  + CurrentStream mode → Re-open CurrentStream
| 113 | +Video alive + CurrentStream mode → TryFollowActiveSpeaker()
| 114 | +``` |
| 115 | + |
| 116 | +### `EnsureAudioIsPlaying()` |
| 117 | + |
| 118 | +``` |
| 119 | +Any audio source dead → Release all, re-collect all audio tracks |
| 120 | +All audio alive → No action |
| 121 | +``` |
| 122 | + |
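| 123 | +Because a recovery cycle re-collects every audio track in the room, a participant who leaves and rejoins, or a newly joined participant, is picked up automatically on the next cycle (triggered, per the rule above, when any existing audio source is found dead).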
| 124 | + |
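The two decision tables above can be written down as small pure functions. This TypeScript sketch only restates the rules in this document; the type and function names (`VideoAction`, `videoAction`, `audioNeedsRecollect`) are hypothetical, and the `"None"` result for a live video in `UserStream` mode is inferred from the "no auto-switching" rule rather than from the table itself.

```typescript
type Mode = "CurrentStream" | "UserStream";

type VideoAction =
  | "FallBackToCurrentStream"
  | "ReopenCurrentStream"
  | "FollowActiveSpeaker"
  | "None";

// Mirrors the EnsureVideoIsPlaying() table.
function videoAction(videoAlive: boolean, mode: Mode): VideoAction {
  if (!videoAlive) {
    return mode === "UserStream" ? "FallBackToCurrentStream" : "ReopenCurrentStream";
  }
  // Alive: only CurrentStream follows the active speaker.
  return mode === "CurrentStream" ? "FollowActiveSpeaker" : "None";
}

// Mirrors the EnsureAudioIsPlaying() table: one dead source triggers a full re-collect.
function audioNeedsRecollect(sourcesAlive: readonly boolean[]): boolean {
  return sourcesAlive.some((alive) => !alive);
}
```

Note that `audioNeedsRecollect([])` is `false` in this sketch, so a room with no open audio sources would not trigger a re-collect by this rule alone; whether the real implementation treats that case differently is not stated here.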
| 125 | +--- |
| 126 | + |
| 127 | +## System Architecture |
| 128 | + |
| 129 | +### ECS Systems |
| 130 | + |
| 131 | +| System | Group | Responsibility | |
| 132 | +|--------|-------|---------------| |
| 133 | +| `CreateMediaPlayerSystem` | ComponentInstantiation | Detects new `PBVideoPlayer`/`PBAudioStream` components, creates `MediaPlayerComponent` with appropriate backend | |
| 134 | +| `UpdateMediaPlayerSystem` | SyncedPresentation | Drives playback each frame: calls `EnsureVideoIsPlaying()` and `EnsureAudioIsPlaying()`, and handles volume crossfading and texture updates |
| 135 | +| `CleanUpMediaPlayerSystem` | CleanUp | Disposes players when entities/components are removed | |
| 136 | + |
| 137 | +### Factory |
| 138 | + |
| 139 | +`MediaFactory` (built by `MediaFactoryBuilder` per scene) decides which backend to create based on the URL scheme. It holds a reference to the scene's `IRoom` from `IRoomHub`. |
| 140 | + |
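The backend choice follows the URL-scheme table in the Overview. A minimal TypeScript sketch of that dispatch, assuming a simple prefix check; `backendFor` is a hypothetical name and returning `null` for other schemes is an assumption, since the behaviour of `MediaFactory` for unsupported URLs is not described here.

```typescript
type Backend = "AvProPlayer" | "LivekitPlayer";

// Picks a backend from the URL scheme, following the Overview table.
// Assumption: any other scheme is unsupported and yields null.
function backendFor(url: string): Backend | null {
  if (url.startsWith("livekit-video://")) return "LivekitPlayer";
  if (url.startsWith("http://") || url.startsWith("https://")) return "AvProPlayer";
  return null;
}
```

For example, `backendFor("livekit-video://current-stream")` selects the LiveKit backend, while an HLS playlist served over `https://` goes to AvPro.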
| 141 | +### Component |
| 142 | + |
| 143 | +`MediaPlayerComponent` wraps a `MultiMediaPlayer` (which is either `AvProPlayer` or `LivekitPlayer`). It also tracks frozen-stream detection and audio visualization buffers. |
| 144 | + |
| 145 | +--- |
| 146 | + |
| 147 | +## SDK Integration |
| 148 | + |
| 149 | +### How a scene triggers streaming |
| 150 | + |
| 151 | +1. Scene SDK sends a `PBVideoPlayer` component with `src = "livekit-video://current-stream"` (or a specific user address). |
| 152 | +2. `CreateMediaPlayerSystem` picks it up and calls `MediaAddress.New()`, which detects the `livekit-video://` prefix.
| 153 | +3. `MediaFactory` creates a `LivekitPlayer` backed by the scene's LiveKit room. |
| 154 | +4. `UpdateMediaPlayerSystem` drives it every frame. |
| 155 | + |
| 156 | +### `getActiveVideoStreams` API |
| 157 | + |
| 158 | +Scenes can query available streams via `CommsApiWrap.GetActiveVideoStreams()`. The response includes: |
| 159 | + |
| 160 | +```json |
| 161 | +{ |
| 162 | + "streams": [ |
| 163 | + { |
| 164 | + "identity": "participant-id", |
| 165 | + "trackSid": "livekit-video://identity/sid", |
| 166 | + "sourceType": "VTST_CAMERA", |
| 167 | + "name": "Display Name", |
| 168 | + "speaking": true, |
| 169 | + "trackName": "video", |
| 170 | + "width": 1920, |
| 171 | + "height": 1080 |
| 172 | + } |
| 173 | + ] |
| 174 | +} |
| 175 | +``` |
| 176 | + |
| 177 | +A synthetic `current-stream` entry is always included, pointing to the first available participant. |
| 178 | + |
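On the scene side, the response can be typed and searched like any other JSON payload. The sketch below is a hypothetical consumer, not part of the SDK: the interface fields are copied from the example response above, while `pickSource` and its preference for a speaking participant are invented for illustration.

```typescript
// Field names taken from the example response above.
interface ActiveVideoStream {
  identity: string;
  trackSid: string; // full livekit-video:// address, as in the example
  sourceType: string;
  name: string;
  speaking: boolean;
  trackName: string;
  width: number;
  height: number;
}

interface ActiveVideoStreamsResponse {
  streams: ActiveVideoStream[];
}

// Hypothetical helper: prefer a participant who is speaking,
// otherwise fall back to the current-stream address.
function pickSource(
  response: ActiveVideoStreamsResponse,
  fallback: string = "livekit-video://current-stream",
): string {
  const speaking = response.streams.find((s) => s.speaking);
  return speaking !== undefined ? speaking.trackSid : fallback;
}
```

The returned string can be used as the `src` of a `PBVideoPlayer`. Pinning to a specific `trackSid` this way disables automatic speaker following, as described under `UserStream`.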
| 179 | +### CastV2 — Display Name Resolution |
| 180 | + |
| 181 | +Participants joining via castV2 (unauthenticated web viewers) may not have a `Name` field. Display name is resolved with this fallback chain: |
| 182 | + |
| 183 | +``` |
| 184 | +Participant.Metadata.displayName → Participant.Name → Participant.Identity |
| 185 | +``` |
| 186 | + |
| 187 | +Metadata is a JSON string parsed at query time. |
| 188 | + |
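The fallback chain can be sketched as follows. This is a TypeScript illustration, not the code in `GetActiveVideoStreamsResponse.cs`: `ParticipantInfo`, `resolveDisplayName`, and `readDisplayName` are hypothetical names, and treating empty strings and malformed metadata as "missing" is an assumption.

```typescript
// Hypothetical participant shape with the three fields used by the chain.
interface ParticipantInfo {
  identity: string;
  name?: string;
  metadata?: string; // JSON string, may be absent or malformed
}

// Extracts displayName from the metadata JSON, or null if unavailable.
function readDisplayName(metadata: string | undefined): string | null {
  if (!metadata) return null;
  try {
    const parsed: unknown = JSON.parse(metadata);
    if (typeof parsed === "object" && parsed !== null) {
      const value = (parsed as Record<string, unknown>)["displayName"];
      if (typeof value === "string" && value.length > 0) return value;
    }
  } catch (_err) {
    // Assumption: malformed metadata is ignored and the chain continues.
  }
  return null;
}

// Metadata.displayName, then Name, then Identity.
function resolveDisplayName(p: ParticipantInfo): string {
  const fromMetadata = readDisplayName(p.metadata);
  if (fromMetadata !== null) return fromMetadata;
  if (p.name) return p.name;
  return p.identity;
}
```

Because `identity` is always present, the chain always yields a non-empty label as long as the identity itself is non-empty.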
| 189 | +--- |
| 190 | + |
| 191 | +## Key Files |
| 192 | + |
| 193 | +| File | Role | |
| 194 | +|------|------| |
| 195 | +| `SDKComponents/MediaStream/LivekitPlayer.cs` | Core player — video/audio routing, speaker tracking, recovery | |
| 196 | +| `SDKComponents/MediaStream/LivekitAddress.cs` | `CurrentStream` / `UserStream` address REnum | |
| 197 | +| `SDKComponents/MediaStream/MultiMediaPlayer.cs` | Unified wrapper over AvPro and Livekit backends | |
| 198 | +| `SDKComponents/MediaStream/MediaPlayerComponent.cs` | ECS component holding the player | |
| 199 | +| `SDKComponents/MediaStream/Systems/UpdateMediaPlayerSystem.cs` | Per-frame system driving playback | |
| 200 | +| `SDKComponents/MediaStream/Systems/CreateMediaPlayerSystem.cs` | System creating players from SDK components | |
| 201 | +| `SDKComponents/MediaStream/Systems/CleanUpMediaPlayerSystem.cs` | Disposal system | |
| 202 | +| `SDKComponents/MediaStream/MediaFactory.cs` | Factory choosing backend by URL | |
| 203 | +| `SDKComponents/MediaStream/LiveKitMediaExtensions.cs` | URL parsing helpers | |
| 204 | +| `Infrastructure/.../CommsApi/CommsApiWrap.cs` | `getActiveVideoStreams` API | |
| 205 | +| `Infrastructure/.../CommsApi/GetActiveVideoStreamsResponse.cs` | Response builder with display name resolution | |
| 206 | +| `Multiplayer/Connections/Rooms/ParticipantExtensions.cs` | Address construction from participants | |