RnD sync — April 28 to May 5 #861
Open
testradav wants to merge 731 commits into security-union:RnD from
…ect configuration
- Replace unbounded HashSet with bounded VecDeque (max 16 entries) to prevent unbounded growth during long-lived sessions with many re-elections. Oldest entries are evicted when the cap is reached.
- Remove ephemeral dev-session artifacts from settings.local.json (hard-coded pod name, overly broad Read permission).
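A minimal sketch of the bounded-history pattern this commit describes, assuming the entries are `u64` session ids. The `SessionHistory` type, the `MAX_HISTORY` constant name, and the duplicate check are illustrative assumptions, not the actual client code.

```rust
use std::collections::VecDeque;

/// Hypothetical constant name; the cap of 16 comes from the commit message.
const MAX_HISTORY: usize = 16;

/// Illustrative bounded, insertion-ordered history of session ids.
struct SessionHistory {
    entries: VecDeque<u64>,
}

impl SessionHistory {
    fn new() -> Self {
        Self {
            entries: VecDeque::with_capacity(MAX_HISTORY),
        }
    }

    /// Records an id, evicting the oldest entry once the cap is reached.
    fn record(&mut self, id: u64) {
        if self.entries.contains(&id) {
            return; // assumption: keep set-like semantics, no duplicates
        }
        if self.entries.len() == MAX_HISTORY {
            self.entries.pop_front(); // drop the oldest entry
        }
        self.entries.push_back(id);
    }

    fn contains(&self, id: u64) -> bool {
        self.entries.contains(&id)
    }
}
```

A linear `contains` scan over at most 16 entries is cheap, which is why a `VecDeque` can plausibly stand in for the hash set here.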
Error logging on upload failure uses the unwrapped console.error (originals.error) which bypasses the collector, preventing a feedback loop during prolonged backend outages.
Prevents potential underflow if websocket_drop_count() ever resets. The counter is currently a static AtomicU64 (strictly monotonic), but saturating_sub is defensive and costs nothing.
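The defensive subtraction can be sketched as follows. The function name `drops_since` and its two-sample shape are assumptions for illustration; only `saturating_sub` on a `u64` counter comes from the commit message.

```rust
/// Number of drops observed since the previous sample.
/// Clamps to 0 instead of underflowing if the counter ever goes backwards.
fn drops_since(previous: u64, current: u64) -> u64 {
    current.saturating_sub(previous)
}
```

With a plain `current - previous`, a counter reset would panic in debug builds and wrap to a huge value in release builds; the saturating form returns 0 in both.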
Explains why 0.01 is used instead of f64::EPSILON — the PID integrator accumulates many small f64 additions, so rounding error can easily exceed machine epsilon.
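A small illustration of the reasoning, under the assumption that the tolerance is used to compare an accumulated value against an expected one. The helper names are hypothetical; only the 0.01 tolerance comes from the commit message.

```rust
/// Tolerance value taken from the commit message; the constant name is illustrative.
const TOLERANCE: f64 = 0.01;

/// Adds `step` to a running sum `n` times, as an integrator would.
fn accumulate(step: f64, n: usize) -> f64 {
    let mut sum = 0.0;
    for _ in 0..n {
        sum += step;
    }
    sum
}

/// Compares two values using the coarse tolerance rather than machine epsilon.
fn approx_eq(a: f64, b: f64) -> bool {
    (a - b).abs() < TOLERANCE
}
```

Summing 0.1 a thousand times lands close to, but not exactly on, 100.0, and the gap is already larger than `f64::EPSILON`, so an epsilon-based comparison would reject a value that is correct for control purposes.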
The set is built across 3 phases separated by 250+ lines of tile rendering code. A future refactor could desync the pinned-peer insertion from the dedup check. Comments now anchor the contract.
macOS Finder uses "natural sort" (localizedStandardCompare) which treats digit runs inside hex strings as numbers, breaking the chronological ordering of UUIDv7-based filenames. Replace the hex UUID suffix with a zero-padded 5-digit chunk sequence counter (00001, 00002, ...) that sorts correctly in every file manager. Client sends X-Chunk-Seq header with each upload. Server zero-pads to 5 digits for the filename. sendBeacon fallback (which can't set headers) retains UUIDv7 suffix — these sort after numbered chunks.
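A sketch of the server-side zero-padding, assuming a filename layout of prefix, dash, sequence, and a `.log` extension. That layout and the function name are assumptions; only the 5-digit zero-padded counter comes from the commit message.

```rust
/// Builds a chunk filename with a zero-padded 5-digit sequence number.
/// The `prefix-NNNNN.log` layout is hypothetical.
fn chunk_filename(prefix: &str, seq: u32) -> String {
    format!("{}-{:05}.log", prefix, seq)
}
```

Because every sequence number has the same width, plain lexicographic order and digit-aware "natural" order agree, which is the property the commit relies on.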
…ecovery fix: PID integrator stuck at maximum with no recovery
Resolves conflict in dioxus-ui/src/components/waiting_room.rs VideoCallClientOptions: keep display_name.clone() from this branch and add is_guest from PR-staging's guest-support work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…am-c-ux-polish fix: tile layout bounce + GUID display name (Workstream C)
…ressure-aq-activation fix: enable adaptive quality for WebSocket-only users (OSS security-union#859)
…m/labs-projects/videocall into fix/owner-leaves-meeting
Resolves conflict in videocall-client/src/diagnostics/encoder_bitrate_controller.rs combining PR security-union#343's AQ_STATUS/AQ_BITRATE_CHANGE diagnostic logging with PR security-union#337's PID-stuck watchdog + per-tier-change pid.reset() calls. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…log-upload-collision fix: console log upload collision and retry storm
…aves-meeting When host ends meeting, meeting is closed for everyone
Resolves architectural conflict in dioxus-ui/src/components/attendants.rs: end_on_host_leave and allow_guests toggles from PR-staging ported into the extracted PreJoinSettingsCard component (PR security-union#340's design) rather than left inline. Also reverts dioxus-ui/scripts/config.js to match PR-staging - dev-only OAuth/WebTransport config changes are out of scope for this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Design join page in same view as other pages
…on-signal-user-id-routing fix: match CONGESTION signals against session_id history to survive re-election
…0-selective-decode fix: cap large-meeting decode load to active layout
Resolve screen_encoder and health_reporter conflicts:
- Keep security-union#464's run_screen_encoding structure with start_with_stream
- Apply encoder error counter statics/getters and increments inside run_screen_encoding
- Keep security-union#513's HealthReporter::shutdown field/method/Weak ref pattern
- Rename frames_emitted -> frames_submitted_ok (Tony review feedback)
- Extract shared classify_encode_error helper with unit tests
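A rough sketch of what a shared classification helper along these lines could look like. The bucket names follow the error classes listed in the PR description, but the matching substrings and the enum layout are assumptions, and the `configure_fatal` bucket is left out because its criteria are not described here.

```rust
/// Illustrative subset of error buckets; variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EncodeErrorBucket {
    ClosedCodec,
    VpxMemAlloc,
    Generic,
}

/// Routes an error message to a bucket by substring match.
/// The substrings are guesses based on the error texts quoted in the PR description.
fn classify_encode_error(msg: &str) -> EncodeErrorBucket {
    let lower = msg.to_ascii_lowercase();
    if lower.contains("closed codec") {
        EncodeErrorBucket::ClosedCodec
    } else if lower.contains("memory allocation") {
        EncodeErrorBucket::VpxMemAlloc
    } else {
        EncodeErrorBucket::Generic
    }
}
```

Keeping the classifier a pure function of the message is what lets both encoders share it and lets unit tests pin the mapping.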
…-auto-recovery feat(client): encoder auto-recovery on closed-codec / VPX memory errors
…-error-metrics — resolve encoder counter conflicts
…e-token-based-styling-and-replace-hardcoded-values Introduce token based styling and replace hardcoded values
…-error-metrics feat(client): encoder error counters and frames-emitted in health packet
…ion-lost-reason-v3 — add encoder counter fields alongside connection-loss fields
…ion-lost-reason-v3 feat(client): distinguish handshake failures from session drops in connection loss
[stub 784] Cropping lost when users join
Set ALLOW_ANONYMOUS to false by default
…ncoder-aq-metrics fix(client): restore camera encoder shared AQ metrics
Adds a per-connection counter of consecutive implausible-RTT discards (rejections from the `RTT_SANITY_MAX_MS` plausibility filter in `handle_rtt_response`). The 1Hz `check_rtt_degradation` watchdog now reads this counter on the active connection and triggers re-election when the streak exceeds `REELECTION_IMPLAUSIBLE_DISCARDS_THRESHOLD` (10). A single plausible measurement resets the streak. This is defense-in-depth alongside PR-A: without it, a broken time-base (client/server clock drift, NTP slew, virtualized-clock stretch) silently drains the existing RTT detector of samples and leaves the user stuck on a dead connection. Discussion security-union#539 documents one such incident: 255 implausible discards over 6 minutes on JRG_dirs (2026-05-05) where the server-side clock ticked ~2.6x the client performance.now() rate. Threshold of 10 means ~10 seconds of sustained brokenness at the 1Hz post-election probe rate before re-election fires - long enough to ride out a one-shot late ACK or NTP slew but short enough that the user does not perceive the connection as dead. With only one server configured the trigger is suppressed (re-election would just reproduce the same brokenness) and the streak is reset so we do not log on every tick. Verification: 8 new unit tests under `connection_manager::tests::implausible_*` and `*_implausible_*`, including 11-discard trigger, 10-discard non-trigger (boundary), 1+1+1 intermittent reset, single-server skip, and re-election-in-progress guard. All pass on the previous worktree HEAD; the lib test target is broken on PR-staging tip due to the unrelated security-union#526 issue. `cargo check` plus `cargo clippy --target wasm32-unknown-unknown -- -D warnings` are clean (default plus `--no-default-features`). Refs: discussion security-union#539
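The streak logic described above can be sketched as a small state machine. The `RttStreak` type and method names are hypothetical, and the order of the guards is an assumption; the threshold value, the reset on a plausible sample, and the single-server suppression come from the commit message.

```rust
/// Threshold from the commit message.
const REELECTION_IMPLAUSIBLE_DISCARDS_THRESHOLD: u32 = 10;

/// Illustrative per-connection streak tracker.
#[derive(Default)]
struct RttStreak {
    consecutive_implausible_discards: u32,
}

impl RttStreak {
    /// Called per RTT response: a plausible sample resets the streak,
    /// an implausible one extends it.
    fn on_sample(&mut self, plausible: bool) {
        if plausible {
            self.consecutive_implausible_discards = 0;
        } else {
            self.consecutive_implausible_discards =
                self.consecutive_implausible_discards.saturating_add(1);
        }
    }

    /// Watchdog tick: returns true when re-election should fire.
    fn should_reelect(&mut self, total_servers: usize, reelection_in_progress: bool) -> bool {
        if self.consecutive_implausible_discards <= REELECTION_IMPLAUSIBLE_DISCARDS_THRESHOLD {
            return false; // streak must exceed the threshold, so 10 does not trigger
        }
        if reelection_in_progress {
            return false;
        }
        if total_servers <= 1 {
            // Re-election would land on the same server; reset so we do not log every tick.
            self.consecutive_implausible_discards = 0;
            return false;
        }
        true
    }
}
```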
…ailable When the RTT-degradation watchdog fires but only one server is configured, the existing rebase path silently adapts the baseline to the degraded RTT and stays on the slow connection forever — even if conditions later improve. Tony Estrada and Anhelina hit this in meeting_sync on 2026-05-05 (discussion labs-projects/videocall#539): both rebased to >1s RTT and could only recover by manually rejoining. Adds a 30s re-election retry timer after each rebase (capped at 3 attempts, ~90s total budget). When the timer fires the manager re-evaluates whether the candidate set has grown — e.g. dioxus-ui called update_server_urls after refreshing the room token — and invokes start_reelection if so. If the URL list is still single-server it schedules another retry until the budget is exhausted. The counter is reset on reset_and_start_election and on a successful complete_election so each new session/election starts with a fresh retry budget. The retry honours the user transport preference. The dioxus-ui already exposes a TransportPreference context (Auto / WebTransportOnly / WebSocketOnly, persisted in localStorage / sessionStorage and resolved at URL-build time via resolve_transport_config). Manual WebTransportOnly / WebSocketOnly selections produce a single-candidate URL list by design, so the retry must not override that choice. To plumb this through the connection manager (which only sees the post-filter URL list), this PR adds allow_post_rebase_retry: bool to VideoCallClientOptions and ConnectionManagerOptions. The dioxus-ui passes transport_pref == TransportPreference::Auto for the main meeting client and false for short-lived observer clients (waiting room, guest join, meeting page). The async timer body is gated to target_arch = "wasm32" because gloo_timers and wasm_bindgen_futures::spawn_local panic outside the browser. The retry decision logic is factored into a pure decide_post_rebase_retry_action so the policy is host-test-safe. 
Tests:
- post_rebase_retry_decision_*: pure policy assertions for every branch (Skip / FireElection / Reschedule).
- post_rebase_retry_increments_counter_when_allowed: rebase under Auto preference advances the retry counter.
- post_rebase_retry_not_scheduled_when_user_pref_forbids: manual transport preference suppresses the retry.
- post_rebase_retry_caps_at_max_attempts: budget cap is honoured.
- post_rebase_retry_counter_resets_on_reset_and_start_election: counter reset path is plumbed through the public API.
All 54 tests in connection::connection_manager::tests pass on host (run locally; the peer_decode_manager test compilation blocker tracked as security-union#526 is independent of this change).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
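A sketch of how a pure retry policy with these three outcomes might be shaped. The parameter list, the `POST_REBASE_MAX_RETRIES` name, and the order in which the checks run are assumptions; only the three outcomes, the cap of 3 attempts, and the `allow_post_rebase_retry` flag come from the commit message.

```rust
/// Hypothetical constant name; the cap of 3 attempts comes from the commit message.
const POST_REBASE_MAX_RETRIES: u32 = 3;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PostRebaseRetryAction {
    Skip,
    FireElection,
    Reschedule,
}

/// Pure policy: decides what the retry timer should do when it fires.
/// `attempts_so_far` counts retries already scheduled (assumed semantics).
fn decide_post_rebase_retry_action(
    allow_post_rebase_retry: bool,
    attempts_so_far: u32,
    total_server_count: usize,
) -> PostRebaseRetryAction {
    if !allow_post_rebase_retry {
        return PostRebaseRetryAction::Skip; // manual transport preference wins
    }
    if total_server_count > 1 {
        return PostRebaseRetryAction::FireElection; // candidate set has grown
    }
    if attempts_so_far >= POST_REBASE_MAX_RETRIES {
        return PostRebaseRetryAction::Skip; // budget exhausted
    }
    PostRebaseRetryAction::Reschedule
}
```

Because the function touches no timers or browser APIs, it can run in host-side unit tests, which matches the stated reason for factoring the policy out.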
…eak doctests
Addresses two blockers from the @jay-boyd / @Antonio-Estrada review of security-union#542:
1. update_server_urls now propagates from VideoCallClient -> ConnectionController -> ConnectionManager so the post-rebase retry's total_server_count() sees refreshed candidate URLs. Without this, dioxus-ui calling update_server_urls after a token refresh left the retry timer reading stale single-server URLs forever.
2. Updated lib.rs doctests to include allow_post_rebase_retry and is_guest so cargo test --doc -p videocall-client compiles. (The is_guest field is from a separate prior change but the doctest was doubly-broken.)
Tests:
- update_server_urls_propagates_into_total_server_count locks in the propagation
- post_rebase_retry_decision_fires_after_url_propagation exercises the end-to-end invariant (single-server -> multi-server transition flips the retry decision)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ble-rtt-reelection-trigger fix(client): treat sustained implausible RTT as a re-election trigger
…failure
When a re-election starts and ALL candidates fail before producing valid RTT samples (the JRG_dirs Tony S1 incident on 2026-05-05 15:05:47 UTC, discussion security-union#539), check whether the old active connection has had any inbound traffic within the last 5 s. If yes, the candidates' failure is treated as a transient relay-side outage and the old connection is preserved instead of being torn down. A 30 s re-election retry is scheduled to give the relay time to recover.
Tony's S1 in JRG_dirs was on a healthy WebTransport connection (69 ms baseline RTT) when his RTT spiked to 1477 ms over 5 consecutive samples, triggering automatic re-election. Both new candidates (wt_0_g1 and ws_0_g1) failed handshake within 14 ms of each other (a brief relay-side outage). The election deadline lapsed 4 s later, the system declared "Election failed: No valid connections with RTT measurements found", and the user was force-disconnected. He was absent for 16 minutes before manually rejoining. The old wt_0 connection was still alive when the candidates failed (still receiving media + heartbeat). With this fix, the old connection stays put.
Implementation:
- Track last_inbound_at_ms per connection. Updated by the inbound media callback on every inbound packet (media, RTT echo, SESSION_ASSIGNED, heartbeat ACK).
- New helper try_preserve_old_connection_on_candidate_failure runs in the Err branch of complete_election. Returns true if preservation applies; the caller skips the existing Failed-state emission.
- Preservation conditions: re-election in progress, the reelection_preserved_once guard is false, the old active slot is populated, and the old connection's inbound freshness is within REELECTION_PRESERVATION_FRESHNESS_MS (5 s).
- Preservation actions: restore the old connection to self.connections, restore its RTT measurement, set election state to Elected on the old id, rebase the degradation baseline, close failed candidates, schedule a REELECTION_PRESERVATION_RETRY_MS (30 s) retry, and emit Connected (NOT Failed).
- Anti-loop guard: reelection_preserved_once is set on preservation, cleared only when a re-election cycle reaches a clean conclusion (Elected or aborted-on-no-improvement). If the 30 s retry's election ALSO fails total-candidate-failure, the guard forces the failure path to fall through to disconnect, guaranteeing a genuinely dead connection cannot be pinned indefinitely.
Tests:
- Fall-through when the old connection is silent past the 5 s window
- Fall-through when no inbound traffic was ever recorded
- Fall-through when not in a re-election (initial-election failure path)
- Fall-through when the preservation guard is already set
- 4.99 s vs 5.01 s freshness boundary checks
- start_reelection retains only the old active's freshness entry
- reset_and_start_election and disconnect both clear preservation state cleanly
- Constant-value documentation tests so future tuning is intentional
Refs: security-union#539 (JRG_dirs analysis), Tony Estrada S1 timeline 2026-05-05 15:05:47 UTC.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
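The preservation conditions listed in this commit can be condensed into a pure predicate, sketched below. The function name and parameter shape are hypothetical (the real helper is `try_preserve_old_connection_on_candidate_failure`, whose signature is not shown here); the 5 s freshness window and the guard conditions come from the commit message.

```rust
/// Freshness window from the commit message (5 s).
const REELECTION_PRESERVATION_FRESHNESS_MS: u64 = 5_000;

/// Illustrative predicate: should the old active connection be kept after
/// every re-election candidate failed?
/// `last_inbound_at_ms` is `None` when there is no old active connection
/// or no inbound traffic was ever recorded for it.
fn should_preserve_old_connection(
    reelection_in_progress: bool,
    reelection_preserved_once: bool,
    last_inbound_at_ms: Option<u64>,
    now_ms: u64,
) -> bool {
    if !reelection_in_progress || reelection_preserved_once {
        return false; // initial-election failure, or the anti-loop guard is set
    }
    match last_inbound_at_ms {
        Some(t) => now_ms.saturating_sub(t) <= REELECTION_PRESERVATION_FRESHNESS_MS,
        None => false,
    }
}
```

The boundary cases mirror the listed tests: traffic 4.99 s ago preserves the connection, traffic 5.01 s ago falls through to the disconnect path.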
Addresses the blocker from the @jay-boyd / @Antonio-Estrada review of security-union#544. The flag was correctly cleared in reset_and_start_election:410 but missed in three other paths where re-election cleanly concludes:
1. start_reelection, after the in-progress early-return guard
2. complete_election abort-on-no-improvement
3. complete_election Elected success branch
Without these clears, a 30s preservation-retry timer armed by a prior candidate-failure event could fire spuriously on a just-elected healthy connection. The reelection_in_progress guard inside start_reelection only absorbs the race when the new cycle is still running at retry time; once the new cycle completes, the stale timer fires anyway and triggers unnecessary churn. Pattern matches the existing cleanup in reset_and_start_election. No new logic, just hoisting the cleanup invariant to the missing sites.
Tests:
- start_reelection_clears_pending_preservation_retry
- complete_election_elected_branch_clears_pending_preservation_retry
- complete_election_abort_no_improvement_branch_clears_pending_preservation_retry
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Resolves the additive conflicts from security-union#541 in:
- videocall-client/src/adaptive_quality_constants.rs
- videocall-client/src/connection/connection_manager.rs
Both files were additive in adjacent regions; resolution = keep both sides. The PR-staging branch's REELECTION_IMPLAUSIBLE_DISCARDS_THRESHOLD constant and the corresponding import are kept alongside this branch's POST_REBASE_* constants and update_server_urls method.
…base-reelection-retry feat(client): post-rebase re-election retry when candidates were unavailable
…ailure
Resolves conflicts introduced by security-union#542 [post-rebase re-election retry] landing on PR-staging. Both PRs are additive in adjacent regions of adaptive_quality_constants.rs and connection_manager.rs - struct fields, constructors, and test module. Resolution is "keep both" throughout; no semantic clash between security-union#542's post-rebase retry machinery and this PR's old-connection-preservation machinery.
Verifications at the merge head:
- cargo fmt --all -- --check [clean]
- cargo check --target wasm32-unknown-unknown -p videocall-client [default + --no-default-features, both clean]
- cargo clippy --target wasm32-unknown-unknown -p videocall-client -- -D warnings [default + --no-default-features, both clean]
- cargo test --doc -p videocall-client [7/7 passing]
Invariant greps post-merge:
- 5 reelection_retry_pending clear-sites [this PR]
- 22 consecutive_implausible_discards references [security-union#541, intact]
- 3 update_server_urls definitions across the 3-layer plumbing [security-union#542]
- All 5 expected test names from both PRs survived
…on-preserve-old-on-total-failure fix(client): re-election preserves old connection on total candidate failure
…ignal-quality popups
Each client now stamps its active transport into the periodic HeartbeatMetadata (new TransportType enum, field 5). Receivers track the latest value per peer and forward it through the existing peer_status DiagEvent as a peer_transport text metric. Two UI surfaces consume it:
- The diagnostics popup's Per-Peer Summary renders a compact WT / WS / em-dash pill next to each peer's buffer/jitter line.
- The signal-quality popup (chart popover behind each peer's signal-bars icon) renders the same pill in its header next to the peer name.
Signal writes are gated on actual change so heartbeat ticks (~1 Hz per peer) don't trigger UI re-renders. Backward compatible: peers on older clients arrive as TRANSPORT_UNKNOWN and render as em-dash.
E2E coverage in diagnostics-peer-transport.spec.ts (3-browser, gated on >= 2 remote peers) and signal-quality-peer-transport.spec.ts (2-browser, opens popup via aria-label="Show signal quality").
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ic-tweaks feat(client,ui): per-peer transport (WT/WS) badge in diagnostics & signal popups
May 5
Encoder recovery & observability
feat: encoder auto-recovery on closed-codec / VPX memory errors - when `VideoEncoder.encode()` or `.configure()` throws `InvalidStateError: closed codec` or a VPX `Memory allocation error`, the encoder now tears itself down and restarts automatically, with no user intervention (camera toggle) required. The camera encoder reacquires `getUserMedia` and cleans up partially-initialized capture resources before retrying; the screen encoder reacquires a fresh `getDisplayMedia` stream and re-emits a `Started` event so the UI rebinds to the replacement stream. Exponential backoff (500ms × min(attempt, 4), max 5 restarts before surfacing `Failed`); the restart counter resets on the first successful encode so transient errors don't accumulate across long-lived sessions. Adds a `CodecState::Closed` guard before every `configure()` call. Unit tests lock in the fatal-vs-non-fatal classification predicates and the screen-capture-reacquisition decision.
feat: encoder error counters and frames-emitted in `HealthPacket` - adds 10 new optional `uint64` fields to `HealthPacket` tracking encoder errors by class (`closed_codec`, `vpx_mem_alloc`, `configure_fatal`, `generic`) plus `frames_submitted_ok` for both camera and screen encoders. Closes a Prometheus blind spot: `encoder_output_fps` could show a steady 30 fps target while both encoders were dead. A new shared `classify_encode_error(msg) -> EncodeErrorBucket` helper routes errors identically across both encoders. Counters are non-zero-guarded so clean-state packets stay small.
feat: distinguish handshake failures from session drops in connection loss - adds `ConnectionLostReason::HandshakeFailed` / `SessionDropped` and threads it through the full `on_connection_lost` callback chain. Log lines now read `[HANDSHAKE FAILED]` or `[SESSION DROPPED]` instead of a single undifferentiated message. Two new health-packet counters. A transport-layer `handshake_complete` flag is set when `ready()` / `onopen` resolves; the close handlers consult it to classify. A triple-promise `fired` guard for WebTransport covers `ready.catch` / `closed.then` / `closed.catch`; an analogous `ws_fired` guard for WebSocket covers the common Close+Error double-fire. The active-connection guard is placed BEFORE the counter increments so election-probe failures to losing servers don't inflate metrics.
fix: restore camera encoder shared adaptive-quality metrics - the encoder-loop rewrite that introduced auto-recovery inadvertently deleted 12 of 14 public accessor methods on `CameraEncoder` (shared video / audio tier index, encoder output fps, fps ratio, worst-peer fps, bitrate ratio, target bitrate kbps, tier transitions, climb-limiter snapshot, dwell samples, re-election completed signal). The encoder built clean in isolation (no internal callers) but the UI consumes all twelve, so the workspace `cargo check` failed at the rollup gate. This change restores both the API surface and the backing shared state + control-loop wiring (12 new struct fields, encoder loop writes to fps_ratio / worst_peer_fps / bitrate_ratio / target_bitrate_kbps every diagnostics tick, tier-transition draining, re-election signal `swap(false, AcqRel)` consume, climb-limiter snapshot updates, dwell-sample draining, WS self-congestion sliding-window check).
Re-election & connection-preservation hardening
fix: treat sustained implausible RTT as a re-election trigger - the existing `RTT_SANITY_MAX_MS` plausibility filter silently drops anomalous RTT measurements (e.g., a 2.6× clock-rate bug in the WebSocket heartbeat path), but the elevated-RTT degradation detector relies on accepted samples, so a streak of all-implausible measurements would never trigger re-election. Recent incident data showed 255 consecutive implausible discards over 6 minutes with no re-election ever firing. This change adds a per-connection `consecutive_implausible_discards` counter; the 1Hz watchdog consults the counter on the active connection and fires re-election when the streak exceeds `REELECTION_IMPLAUSIBLE_DISCARDS_THRESHOLD` (10, chosen for ~10s wall-clock, riding out one-shot late ACKs without thrashing). Defense-in-depth alongside the root-cause clock-drift fix; catches any future class of clock/time-base brokenness.
feat: post-rebase re-election retry when candidates were unavailable - when degradation triggers a rebase but only one server is configured (a transient relay outage or a deliberate `WebTransportOnly` / `WebSocketOnly` user pref), today's behavior is "rebase RTT baseline and never retry", so a brief relay outage strands the user on the rebased connection forever. This change adds a 30s retry timer that re-evaluates candidate availability after each rebase and triggers re-election when the URL list grew (e.g., a fresh room-token refresh added back the alternate transport). Retry budget capped at 3 attempts (~90s wall-clock). The retry counter resets on a full reconnect or a successful election. The user transport preference is now plumbed through `VideoCallClientOptions` → `ConnectionManagerOptions` as `allow_post_rebase_retry: bool`, so deliberate `WebTransportOnly` / `WebSocketOnly` selections never trigger an auto-recovery that overrides the user's choice.
fix: re-election preserves old connection on total candidate failure - when automatic re-election fires and ALL new candidates fail their handshake before producing valid RTT samples, today's election-deadline path force-disconnects the user even though the original active connection is often still receiving media + heartbeat. Recent incident: a healthy WebTransport (69ms baseline RTT), 5-sample RTT spike triggered re-election, both new candidates failed handshake within 14ms of each other (relay was briefly unreachable), election deadline expired; the user was kicked off a still-live connection and was absent 16 minutes. This change tracks `last_inbound_at_ms` per connection and, in the election-failure branch, runs a new `try_preserve_old_connection_on_candidate_failure` helper that restores the old connection + RTT measurement when its inbound freshness is ≤ 5s, schedules a 30s retry, and emits `Connected` (NOT `Failed`). A `reelection_preserved_once` guard prevents pinning a genuinely dead connection indefinitely: if the 30s retry's election ALSO fails, the guard forces the disconnect path.
UI polish & dev hygiene
feat: token-based styling system replacing hardcoded values - three-tier token system: `dioxus-ui/static/tokens-v0.json` (frozen contract, drift-checked) → `global.css` `:root` block (CSS source of truth) → `dioxus-ui/src/theme.rs` (Rust-side constants for SVG/charts). Replaces hardcoded color/spacing/effects values across `style.css`, `signal_quality.rs`, `neteq_chart.rs`, `search_modal.rs`, `meeting.rs`, `attendants.rs`, `appearance_settings_panel.rs`, `guest_join.rs`, and `routing.rs` with `var(--token-name)` / `theme::TOKEN` references. Two drift-check scripts (`scripts/check-token-drift.sh` and `scripts/check-hardcoded-colors.sh`) gate against regressions, plus a new CI workflow that fires on every PR. Sets a foundation for future theming support; ships with a 220-line architecture doc at `dioxus-ui/docs/styling-tokens.md`.
fix: crop button state persists across new-peer joins - crop-button state moved from DOM `class_list.add/remove` manipulation to a Dioxus `CroppedTilesCtx: Signal<HashMap<String, bool>>` provided at `AttendantsComponent` scope, surviving tile remounts and layout switches (regular grid → split screen-share → full-bleed). `on_peer_removed` prunes both the `peer_id` and `screen-share-{peer_id}` keys to prevent map growth across long meetings. Crop button is hidden when the peer's video is disabled; shows a green active state matching the mic icon. 20 new unit tests in `canvas_generator.rs` (4 crop-specific: toggle round-trip, cleanup-on-removal, missing-ID default-false, None-context default-false).
chore: `ALLOW_ANONYMOUS` now defaults to `false` for local development - flips the env-var fallback in `docker/docker-compose.yaml` and `start_dev.sh` from `true` to `false`, aligning local dev with production's already-secure default. Production unaffected; E2E unaffected (`docker-compose.e2e.yaml` doesn't set the var either, and E2E tests bypass auth via JWT cookie injection). New devs running `docker-compose up` / `./start_dev.sh` without OAuth setup will need to explicitly `export ALLOW_ANONYMOUS=true` to keep the unauthenticated fallback.
feat: per-peer transport (WT/WS) badge in diagnostics & signal popups - each client now stamps its active transport into the periodic `HeartbeatMetadata` proto (new `TransportType` enum, default 0 = unknown for forward/backward compat). Receivers track the latest value per peer and forward it through the existing `peer_status` diagnostics event as a `peer_transport` text metric. Two UI surfaces consume it: the diagnostics popup's Per-Peer Summary and the signal-quality popup header; both render a compact `WT` / `WS` / em-dash pill per peer. Wire cost is ~2 bytes per heartbeat when set; 0 bytes for default/unknown via proto3 default-elision. Receive path uses `enum_value().unwrap_or(...)` so future enum variants from newer clients don't panic. Signal writes are gated on actual change so heartbeats don't trigger UI re-renders. Two new Playwright E2E specs cover both popup surfaces.
May 1
In-meeting settings
…`on_loaded` / `on_audio_selected` / `on_video_selected` / `on_audio_output_selected` callbacks, so no encoder restart or speaker switch fires as a side-effect of opening settings. New Playwright spec joins a meeting, captures mic/cam button states, opens and closes the modal, and asserts the states are unchanged.
Home page UX
feat: unified meetings list with ownership-aware UI gating - replaces the separate "Previously Joined" and "My Meetings" sections on the home page with a single unified meetings list backed by a new `GET /api/v1/meetings/feed` endpoint. Each row carries an "Owner" badge when the authenticated user owns the meeting, and edit / delete / end-meeting actions gate on a server-computed `is_owner` flag, the only authoritative ownership signal in the response. The mutating handlers continue to enforce `creator_id == authenticated_user_id` independently, so the UI flag is cosmetic; the server is authoritative.
perf: meetings feed folds participant counts into a single SQL round-trip - the new `db_meetings::list_feed_for_user` query computes `participant_count` and `waiting_count` per row via `LEFT JOIN LATERAL` subqueries, eliminating the prior 1+2N per-row pattern. Ships a partial-composite-index migration `idx_meeting_participants_meeting_id_admitted` / `_waiting` (partial on `meeting_id` filtered by status) so each count resolves via index range scan instead of heap fetches under load. The user-side dedup is via the lateral aggregate (`MAX(admitted_at)`): a meeting the user both owns and was admitted into appears once. Limit is hard-capped at 200 rows server-side; negative values are rejected with `400 INVALID_INPUT`.
test: three-layer coverage for the new feed and ownership-gating - backend integration tests (`meeting-api/tests/list_feed_tests.rs`) covering owner-only, admitted-only, mixed-deduplication, and pagination boundary; Dioxus unit tests (`dioxus-ui/tests/meetings_list_owner_gating.rs`) asserting the owner badge and management actions render only on `is_owner=true` rows; Playwright spec (`e2e/tests/meetings-ownership.spec.ts`) for the ownership UI under realistic auth.
April 29
Sign-out reliability
…`!user_id.starts_with("anon-")`) to the data-load path on both the home page and the meeting page so a stale guest-session profile no longer enters the `user_profile` signal after sign-out. The home-page logout flow was rewritten to clear local state synchronously, navigate via the SPA router, and fire-and-forget a background fetch to `/logout` for backend session-cookie invalidation, avoiding the OIDC `end_session_endpoint` redirect chain that was identified as the root cause of the "blank /logout screen" + "stuck anonymous" symptom. Includes an in-code DESIGN NOTE documenting the trade-off (the IdP session may stay alive after sign-out) and the long-term recommendation to drive the redirect chain from the backend after the SPA has already navigated away. The auth fast-path `check_session()` no longer bails out on a stale `vc_guest_session_id` marker when the deployment uses server-side OAuth; it clears the marker and falls through to the network check.
Video quality & transport
feat: adaptive initial-tier selection for screen share - replaces the fixed "always start at high (1080p / 2500 kbps)" warm-start with a pure decision function `initial_screen_tier(rtt_ms, camera_tier_index)` that picks high / medium / low based on the network signals at the moment screen-sharing starts. RTT ≥ 400ms drops to low; RTT ≥ 200ms or camera already at sd/below drops to medium; otherwise high. Cold-start (no signals) keeps the existing optimistic high-tier default and lets the PID loop adapt. The chosen tier is applied to the encoder atomics before the encoding loop starts, so the very first encoded frame is at the right bitrate: readable text on constrained presenter uplinks without waiting for the PID to ramp down. The screen-share env-var default `SCREEN_BITRATE_KBPS` was bumped from 100 (below the low-tier `min_bitrate_kbps = 250`) to 1200 (matches medium-tier `ideal_bitrate_kbps`).
docs: comprehensive screen-share capacity-planning guide - `docs/server-sizing-guide.md` gains a new "Screen Share Bandwidth" section covering per-tier bitrate cost (high / medium / low), per-N relay-egress multiplier tables, mixed-mode scenarios (camera + screen + audio), VBR overshoot analysis (analytical 1.5–3× burst above `ideal_bitrate_kbps` during scroll-heavy frames), three mitigation options (CBR for low tier, halving `ideal`, or NIC-headroom-only), and a precise `chrome://webrtc-internals` measurement protocol with a "pending measurements" table for empirical validation. `docs/Monitoring_Production.md` gains a "Screen Share Egress: Operator Callout" section with a Prometheus query (`rate(relay_room_bytes_total{direction="outbound"}[1m])`), a `RoomEgressHigh` alert recommendation, and a 67%-fan-out rule of thumb for a 20-person meeting at high tier.
Home page UX
feat: home page meeting-list polish and tooltip-driven density — My Meetings and Previously Joined rows are now single-line: meeting ID + state pill on the left, edit/delete on the right. All the metadata that previously cluttered each row — duration, time range, attendees, password, etc. — moved into a body-level tooltip portal that escapes every overflow: hidden / overflow-y: auto ancestor, positions itself at the cursor with edge detection (flips when near the right or bottom edge), fades in over 120ms, and respects prefers-reduced-motion. State pills are now title-case (Active / Idle / Ended) with subtle borders and lower-opacity backgrounds; the Owner badge gets the same softer treatment. Per-state tooltip layouts: Active shows Created-on / Started-on / Duration / Attendees / Waiting / Password; Idle shows Created-on / Last-active-on / Password; Ended shows Created-on / Last-active-on / Duration / Password; Previously-Joined shows the same as My Meetings, with Created on shown only when the user owns the meeting.
feat: format_duration and format_datetime helpers — format_duration now handles multi-day durations (1d 1h 1m 1s instead of overflowing the hours field). New format_datetime helper renders dates as Apr 28, 3:07 PM for the tooltip rows. Eight new host-target unit tests cover boundary cases (zero, sub-minute, minutes-and-seconds, hours-no-seconds, just-under-24h, exactly-24h, multi-day, exactly-48h).
fix: meeting-list timestamps in milliseconds (was inconsistent seconds) — list_meetings and create_meeting API handlers were emitting created_at / started_at / ended_at in Unix seconds via .timestamp() while the rest of the meeting-API and the frontend treat these as milliseconds. Result: ended meetings rendered with identical start/end times and a duration of 0s. Both handlers now emit ms via .timestamp_millis(), matching the already-correct get_meeting / end_meeting. JoinedMeetingSummary carries a new created_at: i64 (ms) so the Previously-Joined tooltip can render Created-on for owned meetings. Test coverage in meeting_crud_tests.rs locks in the new ms-magnitude bound (MS_LOWER_BOUND = 1_000_000_000_000) for test_list_meetings_success and a new test_list_meetings_returns_ended_at_in_milliseconds that exercises an idle → active → ended cycle.
fix: activate() now refreshes timestamps when reactivating an idle or ended meeting — surfaced during the home-page polish review. activate() previously only updated state = 'active', leaving started_at at the original-creation time and ended_at stale. New SQL refreshes started_at = NOW() and clears ended_at on idle → active and ended → active transitions while staying idempotent on active → active (so other attendees joining don't bump timestamps). New meeting-api/tests/activate_semantics_tests.rs locks in the refresh-on-reactivation, the active → active byte-identical idempotency, the started_at >= created_at invariant, and created_at immutability across reactivation.
Diagnostics
fix: refine per-peer diagnostics panels — seven targeted UX improvements to the SignalQualityPopup: (1) legend help text bumped from 6px to 8pt (was sub-pixel on most displays); (2) all open peer popups scroll together — scrolling any chart moves all others to the same time slot; (3) unchecking a metric in the legend also hides it from the hover tooltip; (4) "Latency" renamed to "Server RTT" throughout with help text clarifying it is client-to-server and identical for all peers in a session; (5) faint vertical grid lines at every 10-second mark on the chart; (6) RTT polyline made more subtle (40% opacity, thinner stroke, sparser dash) so it recedes behind the quality lines.
feat: redesigned non-owner pre-join card — prominent h2 heading, monospace meeting-ID pill chip with ellipsis-on-overflow, clean subtitle, and a glass divider — replaces the prior generic "Ready to join the meeting?" text.
Test coverage
test: Playwright E2E coverage for the four guest user-flows — three new Playwright spec files (guest-leave.spec.ts, guest-rejection.spec.ts, guest-waiting-room.spec.ts) covering: guest-leave (host sees the tile removed within the grace period); guest-rejection (UI surface plus an API-guard assertion that the rejected guest's DB record carries status=rejected, not just that the UI shows "Entry denied" — closes a class of regression where the UI updates but the DB row is wrong); sequential waiting-room admission with multiple queued guests; and an admitted_can_admit=true regression net that locks in the live-sync auth path (admitted non-host can admit a queued guest). Multi-browser orchestration consistent with speaker-highlight.spec.ts — separate chromium.launch() per participant, per-context auth cookie injection, try/finally cleanup, and Date.now() meeting IDs to avoid parallel-run collisions.
test: Playwright coverage for OAuth-fallback display-name handling — new auth-display-name.spec.ts covers the three non-OAuth fallback scenarios (empty input on initial load with localStorage clear, localStorage restore on page load, direct-navigation pre-fill of meeting-page input from localStorage). Documents the OAuth-specific paths that can't be exercised under ENABLE_OAUTH=false as a known coverage gap.
April 28
Security & access control
feat: per-user rate limiting and NATS packet sanitization for display-name changes — 5/min/user rate limit on display-name changes with a shared budget between rename and rejoin to close the leave-and-rejoin bypass; NATS forwarding sanitizes UTF-8, validates display names, and rewrites room_id authoritatively before publishing; static error messages prevent reflected-input info disclosure; client-side revalidation of server-sent name changes.
feat: guest user flow with waiting room support — guests join via a guest:{uuid} user_id, bypass OAuth, and enter through the waiting room. Includes DB migration, is_guest JWT claim, and UI integration.
feat: meeting ends for all participants when host leaves — adds an end_on_host_leave meeting toggle; when the host disconnects with it on, MEETING_ENDED is broadcast and the host tile cleans up before the overlay renders. Also closes a TOCTOU window for late joiners in the join_attendee transaction.
fix: admitted_can_admit now syncs live via NATS settings events — toggling host permissions mid-meeting now takes effect for already-joined participants. Adds a MEETING_SETTINGS_UPDATED protobuf event; the client refetches status on receipt and updates host-control visibility reactively. Unauthenticated guests receive a read-only observer token (meeting-bound, 30-min TTL) for the refetch.
Video quality & transport
perf: adaptive quality now works for WebSocket-only users — ports the WebTransport bounded-channel + CONGESTION-via-NATS pattern to WebSocket, plus client-side self-detection of WS backpressure that force-steps-down video quality on local drops.
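The bounded-channel pattern and the local drop detection described above can be sketched roughly as follows. This is a minimal illustration using the standard library's synchronous bounded channel as a stand-in for the real async transport queue. The function names, the shared drop counter, and the tier numbering (0 is the highest quality, larger indices are lower) are all assumptions made for the example and are not taken from the codebase.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{SyncSender, TrySendError};

/// Tries to queue a packet without blocking.
/// When the bounded queue is full (or closed), the packet is dropped
/// and the drop counter is incremented instead of stalling the sender.
fn try_enqueue(tx: &SyncSender<Vec<u8>>, packet: Vec<u8>, drops: &AtomicU64) -> bool {
    match tx.try_send(packet) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
            drops.fetch_add(1, Ordering::Relaxed);
            false
        }
    }
}

/// Returns how many drops happened since the last poll and remembers the new total.
/// saturating_sub keeps the delta at zero if the counter were ever reset.
fn drops_since(drops: &AtomicU64, last_seen: &mut u64) -> u64 {
    let now = drops.load(Ordering::Relaxed);
    let delta = now.saturating_sub(*last_seen);
    *last_seen = now;
    delta
}

/// Steps one tier down (towards `lowest`) when new local drops were observed.
/// Hypothetical policy: a single step per poll, never below the lowest tier.
fn next_tier(current: usize, lowest: usize, new_drops: u64) -> usize {
    if new_drops > 0 {
        (current + 1).min(lowest)
    } else {
        current
    }
}
```

A caller would poll drops_since on a timer and feed the result into next_tier, so that a full outbound queue lowers the video tier without waiting for a server-side congestion signal.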
fix: PID integrator stuck at maximum with no recovery — adds a 30-second saturation watchdog that force-resets the integrator when pid_output sits at PID_OUTPUT_MAX too long, breaking the feedback loop where reduced bitrate caused low received FPS which kept the PID saturated indefinitely.
perf: content sharing starts at midpoint quality tier — screen-share now starts at the midpoint tier (720p / 8fps / 1200 kbps) instead of the maximum. The PID controller adapts in either direction — stepping up to 1080p when bandwidth is plentiful, or stepping down under congestion. contentHint = 'detail' applied to the MediaStreamTrack for better codec behaviour on text and fine detail.
fix: upgrade web-transport-quinn 0.8.1 → 0.11.9; fix WT inbound reception — fixes the accept_uni lost-waker bug that silently dropped inbound WebTransport streams. Corrects the inbound-read pattern from read_to_end on persistent streams to an explicit length-prefixed frame loop. Validated by a 6-bot 50/50 WT/WS load test.
fix: join_meeting no longer panics on unexpected state — eliminates a panic code path in join_meeting by returning an appropriate error result instead.
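The length-prefixed frame loop mentioned in the WebTransport fix above can be illustrated with a small synchronous sketch. A read_to_end call only returns when the stream closes, so it never yields on a persistent stream, whereas a frame loop hands each message over as soon as it is complete. The sketch uses std::io::Read as a stand-in for the async receive stream. The 4-byte big-endian prefix, the MAX_FRAME_LEN cap, and the function names are assumptions for illustration, not the wire format used in the PR.

```rust
use std::io::{self, Read};

/// Upper bound on a single frame; an assumed value for this sketch.
const MAX_FRAME_LEN: usize = 1 << 20;

/// Reads one frame: a 4-byte big-endian length followed by that many payload bytes.
/// Returns Ok(None) when the stream ends cleanly on a frame boundary.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut len_buf = [0u8; 4];
    let mut filled = 0;
    // Fill the prefix by hand so a clean EOF can be told apart from a truncated prefix.
    while filled < len_buf.len() {
        match reader.read(&mut len_buf[filled..]) {
            Ok(0) if filled == 0 => return Ok(None),
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "truncated length prefix",
                ))
            }
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(Some(payload))
}

/// Drains every complete frame from the reader until a clean end of stream.
fn read_all_frames<R: Read>(reader: &mut R) -> io::Result<Vec<Vec<u8>>> {
    let mut frames = Vec::new();
    while let Some(frame) = read_frame(reader)? {
        frames.push(frame);
    }
    Ok(frames)
}

/// Encodes a payload with the same assumed 4-byte big-endian prefix.
fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}
```

In a real receiver each frame would be dispatched as soon as read_frame returns rather than collected into a vector; read_all_frames exists here only to make the loop easy to exercise.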
Participant grid & screen share
feat: stable join-time tile ordering with overflow-speaker promotion — tile sort is now deterministic by join time instead of speech activity, eliminating grid shuffling as people take turns speaking. Loud speakers in the overflow set swap into the visible set by displacing least-recently-active peers; density mode escalates automatically so every active speaker stays on-screen.
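One way to picture the ordering rule above is the following sketch. The Peer struct, its field names, and visible_tiles are hypothetical and exist only for illustration. The sketch also stops promoting once every visible peer is speaking, which is the point where the description says density mode escalates instead.

```rust
/// Hypothetical per-peer state used only for this illustration.
#[derive(Debug, Clone)]
struct Peer {
    id: String,
    joined_at_ms: u64,
    last_active_ms: u64,
    speaking: bool,
}

/// Returns the ids of the peers to render, always in join order.
/// Speaking peers from the overflow set replace the visible non-speaking
/// peer that has been inactive the longest.
fn visible_tiles(peers: &[Peer], capacity: usize) -> Vec<String> {
    // Deterministic order: join time first, id as a tie-breaker.
    let mut by_join: Vec<&Peer> = peers.iter().collect();
    by_join.sort_by(|a, b| (a.joined_at_ms, &a.id).cmp(&(b.joined_at_ms, &b.id)));

    let split = capacity.min(by_join.len());
    let (head, overflow) = by_join.split_at(split);
    let mut visible: Vec<&Peer> = head.to_vec();

    for speaker in overflow.iter().copied().filter(|p| p.speaking) {
        // Find the least-recently-active visible peer that is not speaking.
        let victim = visible
            .iter()
            .enumerate()
            .filter(|(_, p)| !p.speaking)
            .min_by_key(|(_, p)| p.last_active_ms)
            .map(|(i, _)| i);
        match victim {
            Some(i) => visible[i] = speaker,
            // Everyone visible is speaking; a real layout would grow the grid here.
            None => break,
        }
    }

    // Re-sort so a promoted speaker lands in its join-time slot, not the victim's.
    visible.sort_by(|a, b| (a.joined_at_ms, &a.id).cmp(&(b.joined_at_ms, &b.id)));
    visible.into_iter().map(|p| p.id.clone()).collect()
}
```

Because the result is re-sorted by join time, tiles that stay visible keep their relative order when a speaker is promoted; only the displaced peer's tile disappears.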
feat: resizable share screen area — draggable vertical handle (clamped 30–85%) between the shared content panel and the participants panel; tile grid collapses to 1 column when the right panel is narrow; stopping a share restores the normal grid without layout artifacts.
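The clamping behaviour of the drag handle can be sketched as below. The 30 and 85 percent bounds come from the description above. The interpretation of the percentage as the width of the shared-content panel, the NARROW_PANEL_PX threshold, and both function names are assumptions made for the example.

```rust
const MIN_SHARE_PERCENT: f64 = 30.0;
const MAX_SHARE_PERCENT: f64 = 85.0;
/// Hypothetical width below which the participants panel counts as narrow.
const NARROW_PANEL_PX: f64 = 320.0;

/// Converts the pointer position into the share-panel width as a percentage
/// of the container, clamped to the allowed range.
fn share_panel_percent(pointer_x: f64, container_width: f64) -> f64 {
    if container_width <= 0.0 {
        return MIN_SHARE_PERCENT;
    }
    (pointer_x / container_width * 100.0).clamp(MIN_SHARE_PERCENT, MAX_SHARE_PERCENT)
}

/// Collapses the tile grid to one column when the participants panel is narrow.
fn participant_columns(container_width: f64, share_percent: f64, default_columns: u32) -> u32 {
    let right_panel_px = container_width * (100.0 - share_percent) / 100.0;
    if right_panel_px < NARROW_PANEL_PX {
        1
    } else {
        default_columns
    }
}
```

Keeping both functions pure makes the drag logic testable without a browser: the pointer handler only has to pass in the current coordinates and apply the returned values.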
fix: new participants see active screen shares immediately — on join, a keyframe request (PLI) is fired on visibility transition so shared content renders immediately instead of staying blank for late joiners.
fix: improved tile grid layout during screen share — better tile placement and density when a screen share is active; resolves a layout collision between tile-ordering and screen-share code paths.
Settings & meeting controls
feat: meeting controls bar collapses and docks — control bar behaves like a Mac dock: primary buttons (mic/cam) always visible; secondary buttons (screen-share, peer-list, diagnostics, settings) collapse after 1 second of inactivity; the bar fades to 10% opacity after 4 seconds and reveals on hover or tap. User can dock the bar at bottom, left, or right.
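The timing rules above reduce to a small pure function of idle time, sketched here. The 1-second, 4-second, and 10% values come from the description. The BarAppearance struct, the function name, and the choice of which side of each threshold is inclusive are assumptions. A tap is modelled as resetting the idle timer, so only hover is passed explicitly.

```rust
use std::time::Duration;

const COLLAPSE_AFTER: Duration = Duration::from_secs(1);
const FADE_AFTER: Duration = Duration::from_secs(4);
const FADED_OPACITY: f32 = 0.10;

/// Hypothetical view state for the control bar.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BarAppearance {
    /// Whether screen-share, peer-list, diagnostics, and settings buttons are shown.
    secondary_visible: bool,
    /// Opacity of the whole bar, from 0.0 to 1.0.
    opacity: f32,
}

/// Derives the bar's appearance from how long the user has been inactive.
/// Primary buttons (mic/cam) are always visible and are not modelled here.
fn bar_appearance(idle_for: Duration, hovered: bool) -> BarAppearance {
    if hovered {
        // Hover reveals the full bar regardless of idle time.
        return BarAppearance { secondary_visible: true, opacity: 1.0 };
    }
    BarAppearance {
        secondary_visible: idle_for < COLLAPSE_AFTER,
        opacity: if idle_for >= FADE_AFTER { FADED_OPACITY } else { 1.0 },
    }
}
```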
feat: Appearance tab in Settings — speaker-highlight glow customization — each participant can choose glow color (preset palette or custom hex), tune outer/inner intensity, or disable the effect entirely. Debounced localStorage writes cancel on navigation to avoid write races. Shared calculation helper eliminates formula duplication between runtime and settings preview.
feat: sticky protocol selector with segmented control in Network settings — replaces the dropdown + immediate confirm() dialog with a three-pill segmented control (Auto · WebTransport · WebSocket) and a deferred Apply button. A "Remember protocol choice" toggle (off by default) writes to localStorage for persistence across restarts or sessionStorage for the current tab only. Selecting Auto always clears both stores.
Home page UX
feat: home page input validation and meeting creation overhaul — Create a New Meeting renamed to Generate a New Meeting ID; the button renders mutually exclusively with Start or Join Meeting (the two never co-exist — empty field shows Generate, filled field shows Start/Join). Generate populates the meeting-id field rather than navigating directly; the user clicks Start/Join to enter. Per-keystroke inline validation shows only when an invalid character is typed, reusing the canonical is_allowed_display_name_char predicate so client and server stay in sync. Info-icon tooltips provide on-demand allowed-character guidance with full keyboard/Escape/outside-click dismissal. Browser tab title cleaned up to videocall.rs.
fix: anonymous profiles no longer hide the sign-in button — with ALLOW_ANONYMOUS=true, the backend returns a valid anonymous profile that was being treated as a logged-in user, permanently hiding the Sign In button. Fixed by filtering user_id.starts_with("anon-") from the auth-dropdown condition.
feat: previously-joined meetings on home page — new "Previously Joined" section showing the user's last 5 admitted meetings (owned or not) ordered by most-recent admission. Backed by a new GET /api/v1/meetings/joined?limit=N endpoint with an INNER JOIN meeting_participants query. State pills reuse the existing state-active / state-idle / state-ended vocabulary; a gold owner badge marks owned rows. Expand/collapse state persists in localStorage per section across reloads.
Other
feat: redesigned pre-join screen — matches the visual language of the rest of the app; settings toggles extracted into a PreJoinSettingsCard component for reuse.
fix: speaker highlight glow resets correctly after speaking stops — tile border now returns to neutral after speaking stops (previously stayed at the glow color); host tile retains speaking glow during screen-sharing. Glow toggle visuals updated to iOS green (ON) / gray (OFF).
fix: console log upload CORS header and gzip compression — adds X-Chunk-Seq to the CORS allow_headers list (was rejecting upload POSTs after a successful OPTIONS preflight); console log chunks are now gzip-compressed at write time (~10:1 ratio on NDJSON text, ~10× less disk I/O per chunk).