Skip to content

feat: enrich getActiveVideoStreams response#7347

Draft
cyaiox wants to merge 7 commits intodevfrom
feat/enhance-livekit-track-selector
Draft

feat: enrich getActiveVideoStreams response#7347
cyaiox wants to merge 7 commits intodevfrom
feat/enhance-livekit-track-selector

Conversation

@cyaiox
Copy link
Member

@cyaiox cyaiox commented Mar 5, 2026

Pull Request Description

What does this PR change?

Enhances the LiveKit media player to support Cast scenarios with multiple participants, richer stream metadata, and automatic recovery when tracks disconnect.

Changes

  • Enrich getActiveVideoStreams response, adding:
    • displayName
    • speaking
    • trackName
    • width
    • height
      to the JSON output, enabling scenes to display participant info and build stream-selection UIs
  • Multi-audio playback, plays all audio tracks from the LiveKit room simultaneously instead of just one, so multi-user Cast sessions produce sound from every participant
  • Auto-resolve paired audio, when opening a specific user stream, maps camera → microphone and screenshare → screenshare_audio to find the correct audio track
  • Auto-recover video when a user's video track ends, falls back to the first available stream (CurrentStream)
  • Auto-recover audio when detecting an audio streams ends, releases them, and re-collects all active audio tracks

Test Instructions

Prerequisites

  • Deploy the template: Streaming Theatre to .zone/.org

Test Steps

  1. Open the Admin tools smart item
  2. Start a new DCL Cast (click on the Activate button)
    a. Click on the Showcase button
    b. Should be an empty list
  3. Join to the cast as speaker (cast app)
  4. Click on the showcase button (Admin tools smart item)
    a. Should be one participant
    b. Click on the Showcase dropdown -> Select the camera option
    c. The Video screen should stream your camera
  5. From the cast app, share your screen
  6. Click on the showcase button (Admin tools smart item)
    a. Should be one participant
    b. Click on the Showcase dropdown -> Select the screen option
    c. The Video screen should stream your screen
    d. Stop sharing your screen
    e. The video screen should stream the first available stream(your camera)

Additional Testing Notes

  • Verify audio plays from all participants when multiple speakers join the Cast
  • Verify the participant name displays correctly for unauthenticated Cast users
  • Verify no audio leaks (sources properly released) after stopping/restarting a stream

Quality Checklist

  • Changes have been tested locally
  • Documentation has been updated (if required)
  • Performance impact has been considered
  • For SDK features: Test scene is included

Code Review Reference

Please review our Code Review Standards before submitting.

Screenshots

image

@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

@cyaiox cyaiox added the force-build Used to trigger a build on draft PR label Mar 5, 2026
@AnsisMalins
Copy link
Collaborator

@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 4609f6b to 46fbe57 Compare March 6, 2026 19:40
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 6, 2026
@AnsisMalins
Copy link
Collaborator

@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 46fbe57 to dd905a9 Compare March 10, 2026 18:23
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 10, 2026
@cyaiox
Copy link
Member Author

cyaiox commented Mar 10, 2026

@claude review

@claude
Copy link
Contributor

claude bot commented Mar 10, 2026

Claude encountered an error —— View job


I'll analyze this and get back to you.

@Ludmilafantaniella
Copy link
Contributor

Tested! Functionality works as expected ✅

TC5-6 TC3-4-evi TC1-2-evi
7347-evi.mp4

⚠️ Side note — cosmetic issues found:

  • "Showcase List" button label wraps to two lines instead of rendering on a single line.
  • Showcase camera dropdown: neither option shows a checkmark by default, even though the camera is already active when Cast is activated. The checkmark only appears after manually selecting an option. Feels a bit counterintuitive visually.
  • Horizontal scrollbar visible in the camera dropdown (already flagged — dev will fix by adjusting dropdown width).

None of these block the functionality, just leaving them on record!

11.03.2026_11.08.39_REC.mp4
image

@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 390bb2e to 66e96cc Compare March 11, 2026 23:24
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 11, 2026
cyaiox added 5 commits March 12, 2026 18:04
…r livekit user streams

Add participant metadata (name, speaking, trackName, width, height) to
GetActiveVideoStreamsResponse JSON output. Enable automatic audio track
resolution in LivekitPlayer when opening a user stream by pairing video
source to the correct audio source (camera→microphone, screenshare→screenshare audio).
Parse displayName from participant.Metadata JSON (used by unauthenticated
castV2 users), falling back to participant.Name then participant.Identity.
When a specific user stream track (livekit-video://identity/sid) is
unpublished, fallback to current-stream instead of retrying the same
dead address every frame. This lets EnsureVideoIsPlaying and
EnsureAudioIsPlaying self-heal by scanning for any available track.
Instead of pairing one audio track to the selected video track,
LivekitPlayer now plays every audio track in the room — so all
participants are heard regardless of which video is displayed.
When LivekitPlayer plays in CurrentStream mode, the video now follows
the dominant speaker using room.ActiveSpeakers with a 1.5s debounce
to prevent flickering. UserStream mode remains pinned to the specified
participant. Also adds docs/cast.md documenting the full cast feature.
@cyaiox cyaiox force-pushed the feat/enhance-livekit-track-selector branch from 43e595a to ccdc25a Compare March 12, 2026 21:04
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 12, 2026
@m3taphysics
Copy link
Collaborator

@claude review

@claude
Copy link
Contributor

claude bot commented Mar 17, 2026

Claude finished @m3taphysics's task in 2m 53s —— View job


Code Review

  • Gather context
  • Read changed files
  • Analyze for bugs, security, and performance issues
  • Post review findings

Issues Found

🔴 Performance — EnsureAudioIsPlaying polls participants lock every frame when room has no audio (line 100)

When audioSources.Count == 0 (room has no audio tracks, or after initial open), anyDied is false and the guard condition !anyDied && audioSources.Count > 0 evaluates to false. The method falls through every frame to OpenAllAudioStreams(), which acquires lock(room.Participants) and iterates all participants. For any scene where audio is temporarily unavailable, this is a per-frame lock + allocation in a hot path. Fix this →

🔴 Audio disruption on every video recovery (line 71)

When a user stream's video dies, EnsureVideoIsPlaying() calls OpenMedia(LivekitAddress.CurrentStream()). OpenMedia always calls CloseCurrentStream() first, which calls ReleaseAllAudioSources() — pooling all audio MonoBehaviours — then immediately re-acquires them. This causes a brief audio dropout on every video fallback, even when audio streams are healthy. The video and audio reset paths need to be separated. Fix this →

🟡 Dead code — FindPairedAudio never called (line 255)

Private method with no callers. Violates the project's code standards: "Don't design for hypothetical future requirements." Remove until actually needed. Fix this →

🟡 Silent JsonException swallowing (line 39)

catch (JsonException) { } in ResolveDisplayName silently discards all metadata parse failures. Malformed participant metadata would silently fall back to Name/Identity with no trace in logs, making it impossible to diagnose in production. Log a warning via ReportHub. Fix this →

🟡 asCurrent values written but never read in CommsApiWrap.cs

asCurrent is used only as a null-check sentinel — the stored (identity, track) tuple is never read back. This is misleading; it should be a plain bool hasWrittenCurrent = false. Fix this →


REVIEW_RESULT: FAIL ❌

}
}

if (!anyDied && audioSources.Count > 0) return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance bug — participants lock acquired every frame when room has no audio

When audioSources.Count == 0 (e.g. room has no audio tracks yet, or after Stop()), anyDied stays false, so !anyDied && audioSources.Count > 0 evaluates to false and the method always falls through to ReleaseAllAudioSources() + OpenAllAudioStreams(). OpenAllAudioStreams acquires lock(room.Participants) and iterates all participants — every frame, for the lifetime of the player when no audio is present.

Fix: track a separate flag for "attempted to open audio, found nothing".

Suggested change
if (!anyDied && audioSources.Count > 0) return;
if (!anyDied && audioSources.Count > 0) return;
if (audioSources.Count == 0 && !anyDied)
{
// No sources and nothing died — room has no audio tracks.
// Only re-check when the frame budget allows (or when first opened).
OpenAllAudioStreams();
return;
}

Simpler fix: add an audioInitialized bool that is set to true in OpenMedia/OpenAllAudioStreams, and only skip the redundant re-poll when audioInitialized && audioSources.Count == 0.

/// Finds the audio track paired to a specific video track from the same participant.
/// Available for future targeted audio scenarios.
/// </summary>
private Weak<AudioStream> FindPairedAudio(string identity, string videoSid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead code — FindPairedAudio is never called

This private method has no callers. Per the project's code standards: "Don't create helpers, utilities, or abstractions for one-time operations. Don't design for hypothetical future requirements."

The docs note it's reserved for "future targeted audio scenarios", but shipping dead code adds maintenance burden and cognitive noise. Remove it until it's actually needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept it in case it will be required in the future, as livekit sends Video and Audio in separate tracks, this utility helps to match the participant Video with their respective Audio.

// If a specific user stream died, fallback to current-stream (first available track)
if (playingAddress.Value.IsUserStream(out _))
{
OpenMedia(LivekitAddress.CurrentStream());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Audio disruption on every video fallback

OpenMedia(LivekitAddress.CurrentStream()) calls CloseCurrentStream() first, which calls ReleaseAllAudioSources() — releasing and re-pooling all audio MonoBehaviour objects — then immediately re-acquires them in OpenAllAudioStreams(). This causes a brief audio dropout on every video recovery cycle, even when the audio streams are perfectly healthy.

Consider splitting the video and audio reset paths so a dying video track doesn't interrupt audio that is still alive:

Suggested change
OpenMedia(LivekitAddress.CurrentStream());
// Video stream died — swap to first available track without resetting audio
if (playingAddress.Value.IsUserStream(out _))
{
currentVideoIdentity = null;
currentVideoStream = FirstVideoTrackingIdentity();
playingAddress = LivekitAddress.CurrentStream();
videoSwitchedAtTime = UnityEngine.Time.realtimeSinceStartup;
playerState = PlayerState.PLAYING;
return;
}

if (!string.IsNullOrEmpty(displayName))
return displayName;
}
catch (JsonException) { }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent exception swallowing — invalid metadata is silently ignored

catch (JsonException) { } discards all parse failures without any trace. If a participant sends malformed metadata this will silently fall through to the Name/Identity fallback, making it very hard to diagnose in the field.

Suggested change
catch (JsonException) { }
catch (JsonException e) { ReportHub.LogWarning(ReportCategory.COMMS, $"Failed to parse participant metadata for display name: {e.Message}"); }

@cyaiox cyaiox marked this pull request as ready for review March 17, 2026 12:49
@cyaiox cyaiox requested review from a team as code owners March 17, 2026 12:49
… rescan, log metadata parse failures

- Add ReopenVideoStream() to recover video without tearing down audio
- Throttle empty-audio rescan to 2s interval to avoid per-frame lock
- Replace silent JsonException catch with ReportHub.LogWarning
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 18, 2026
@cyaiox
Copy link
Member Author

cyaiox commented Mar 18, 2026

@claude review

@claude
Copy link
Contributor

claude bot commented Mar 18, 2026

Claude finished @cyaiox's task in 5m 33s —— View job


Code Review

  • Gather context
  • Read changed files
  • Analyze for bugs, security, and performance issues
  • Post review findings

Progress since last review

The three critical issues from the previous review are now resolved:

  • ✅ Per-frame lock when no audio — throttled with EMPTY_AUDIO_RESCAN_INTERVAL_SECONDS
  • ✅ Audio disruption on video recovery — ReopenVideoStream path no longer resets audio
  • ✅ Silent JsonException — now logs via ReportHub.LogWarning

Remaining Issues

🔴 Race condition — room.ActiveSpeakers iterated without a lock (LivekitPlayer.cs:225)

Every other participant collection access in this file is guarded with lock (room.Participants). room.ActiveSpeakers is iterated bare in TryFollowActiveSpeaker. If the LiveKit networking thread mutates it concurrently the foreach can throw InvalidOperationException or produce torn reads. Needs a lock, a snapshot, or a confirmed note that the collection is inherently thread-safe. (inline comment)

🟡 Dead code — FindPairedAudio has no callers (LivekitPlayer.cs:286)

Still present from the previous review. Per CLAUDE.md: "Don't design for hypothetical future requirements." A method that is never called must be updated whenever surrounding types change, and it misleads call-graph readers. (inline comment)

🟡 Misleading sentinel — asCurrent tuple in CommsApiWrap.cs

(string identity, TrackPublication publication)? asCurrent = null — the stored identity/track values are assigned but never read; only the == null guard is used. Replace with bool currentWritten = false to match intent. (Pre-existing line, not part of the diff — cannot add inline comment.)


REVIEW_RESULT: FAIL ❌


string? dominantSpeaker = null;

foreach (string speakerIdentity in room.ActiveSpeakers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential race condition — room.ActiveSpeakers iterated without a lock

Every other participant collection access in this file uses lock (room.Participants). room.ActiveSpeakers is a shared collection updated from the LiveKit networking thread, but it is accessed here without any synchronisation. If it is mutated concurrently the foreach can throw InvalidOperationException or silently skip entries.

Either wrap in a lock (matching whatever lock ActiveSpeakers uses), use a snapshot, or confirm in a code comment that the collection is already thread-safe. Fix this →

/// Finds the audio track paired to a specific video track from the same participant.
/// Available for future targeted audio scenarios.
/// </summary>
private Weak<AudioStream> FindPairedAudio(string identity, string videoSid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead code — FindPairedAudio has no callers

Per CLAUDE.md: "Don't create helpers, utilities, or abstractions for one-time operations. Don't design for hypothetical future requirements."

The XML doc says "available for future targeted audio scenarios" but shipping uncalled private methods adds maintenance burden: it must be kept compiling, updated if surrounding types change, and confuses readers who try to trace call graphs. Remove until there's a concrete call site. Fix this →

OpenAllAudioStreams now skips tracks that already have a source, so it
can be called periodically without disrupting healthy audio. Dead sources
are removed individually. EnsureVideoIsPlaying also drives audio discovery
since PBVideoPlayer entities don't trigger UpdateAudioStream.
@cyaiox cyaiox added force-build Used to trigger a build on draft PR and removed force-build Used to trigger a build on draft PR labels Mar 18, 2026
@cyaiox cyaiox marked this pull request as draft March 20, 2026 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

force-build Used to trigger a build on draft PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants