Skip to content

Do not send replay frames to spectator server if initial begin play invocation failed#37159

Open
bdach wants to merge 2 commits intoppy:masterfrom
bdach:dont-send-frames-if-start-play-failed
Open

Do not send replay frames to spectator server if initial begin play invocation failed#37159
bdach wants to merge 2 commits intoppy:masterfrom
bdach:dont-send-frames-if-start-play-failed

Conversation

@bdach
Copy link
Copy Markdown
Collaborator

@bdach bdach commented Mar 31, 2026

RFC

Until now, if the initial BeginPlaySession() call failed, the client would continue operating as if it didn't - it would still continue to send frames and call EndPlaySession() at the end of a session.

Server-side, two things generally can happen after this:

  • The sent frames and the EndPlaySession() call are completely ignored as no-ops, or
  • A hub filter (like ClientVersionChecker) that failed the initial BeginPlaySession() call continues to fail the calls to SendFrameData() and EndPlaySession(), all the while creating a storm in logs, because it needs to throw HubExceptions to communicate to users that they need to update their game, and the exceptions can't be silenced from logs because they look like every other failure.

To that end, this has two goals: reduce useless network traffic, and reduce noise in spectator server logs after the client version checks were recently reactivated.

Probably needs tests, but unsure if everyone's going to be on board with this to begin with to be quite frank, so I'm leaving tests for when I'm told this needs tests.

…nvocation failed

Until now, if the initial `BeginPlaySession()` call failed, the client
would continue operating as if it didn't - it would still continue to
send frames and call `EndPlaySession()` at the end of a session.

Server-side, two things generally can happen after this:

- The sent frames and the `EndPlaySession()` call are completely ignored
  as no-ops:

  https://github.com/ppy/osu-server-spectator/blob/7bab117e9d161455485368f63a0607a9e53f9f8a/osu.Server.Spectator/Hubs/Spectator/SpectatorHub.cs#L122-L125
  https://github.com/ppy/osu-server-spectator/blob/7bab117e9d161455485368f63a0607a9e53f9f8a/osu.Server.Spectator/Hubs/Spectator/SpectatorHub.cs#L153-L157

- A hub filter (like `ClientVersionChecker`) that failed the initial
  `BeginPlaySession()` call continues to fail the calls to
  `SendFrameData()` and `EndPlaySession()`, all the while creating a
  storm in logs, because it needs to throw `HubException`s to
  communicate to users that they need to update their game, and the
  exceptions can't be silenced from logs because they look like every
  other failure.

To that end, this has two goals: reduce useless network traffic, and
reduce noise in spectator server logs after the client version checks
were recently reactivated.
@bdach bdach requested a review from peppy March 31, 2026 08:37
@bdach bdach self-assigned this Mar 31, 2026
@bdach bdach moved this from Inbox to Pending Review in osu! team task tracker Mar 31, 2026
@peppy
Copy link
Copy Markdown
Member

peppy commented Mar 31, 2026

LGTM

@bdach
Copy link
Copy Markdown
Collaborator Author

bdach commented Mar 31, 2026

Tests not looking very good, checking.

@bdach bdach added the blocked/don't merge Don't merge this. label Mar 31, 2026
@bdach
Copy link
Copy Markdown
Collaborator Author

bdach commented Mar 31, 2026

Above commit should do it I think, but I'll wait for CI to confirm before unblocking...

As per inline comment in said commit, one way to reproduce is

diff --git a/osu.Game/Online/Spectator/OnlineSpectatorClient.cs b/osu.Game/Online/Spectator/OnlineSpectatorClient.cs
index 22f8cfbf2b..a9f9de812c 100644
--- a/osu.Game/Online/Spectator/OnlineSpectatorClient.cs
+++ b/osu.Game/Online/Spectator/OnlineSpectatorClient.cs
@@ -59,10 +59,12 @@ protected override async Task<bool> BeginPlayingInternal(long? scoreToken, Spect
 
             Debug.Assert(connection != null);
 
+            await Task.Delay(10000).ConfigureAwait(false);
+
             try
             {
                 await connection.InvokeAsync(nameof(ISpectatorServer.BeginPlaySession), scoreToken, state).ConfigureAwait(false);
-                return true;
+                return false;
             }
             catch (Exception exception)
             {

For reference, logging with the aforementioned commit shows the following sequence of events:

[runtime] 2026-03-31 10:25:30 [verbose]: Clearing SpectatorClient state due to failed BeginPlayingInternal call.
[runtime] 2026-03-31 10:25:30 [verbose]: Frames arrived at SpectatorClient outside of gameplay scope and will be ignored.
[runtime] 2026-03-31 10:25:30 [verbose]: Frames arrived at SpectatorClient outside of gameplay scope and will be ignored.
[runtime] 2026-03-31 10:25:30 [verbose]: Frames arrived at SpectatorClient outside of gameplay scope and will be ignored.
[runtime] 2026-03-31 10:25:30 [verbose]: Frames arrived at SpectatorClient outside of gameplay scope and will be ignored.
[runtime] 2026-03-31 10:25:30 [verbose]: Frames arrived at SpectatorClient outside of gameplay scope and will be ignored.
[runtime] 2026-03-31 10:25:30 [verbose]: Frames arrived at SpectatorClient outside of gameplay scope and will be ignored.
[runtime] 2026-03-31 10:25:30 [verbose]: Frames arrived at SpectatorClient outside of gameplay scope and will be ignored.
[runtime] 2026-03-31 10:25:30 [verbose]: SpectatorClient dropping pending frames as the user is no longer considered to be playing.

@bdach
Copy link
Copy Markdown
Collaborator Author

bdach commented Mar 31, 2026

CI still 4 red, but this time it looks like I just got unlucky. Rerunning to be sure.


if (!isPlaying)
{
// it is possible for this to happen if the `BeginPlayingInternal()` call takes a long time,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question I pose is that what if it takes a long time but doesn't fail? We are potentially dropping frames from the start of a recorded replay... unless I'm missing something.

Copy link
Copy Markdown
Collaborator Author

@bdach bdach Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not looked into it. If it's a real shortcoming, it's already in master.

My hope based on what I know about signalr delivery guarantees would be that signalr side invocation ordering would delay / queue the client frame sending calls until the begin play call succeeds, but I've not tested. I can try testing it tomorrow.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference on master may be the clearing of frames which was added here. But yeah, let's investigate this one before pushing this out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Pending Review

Development

Successfully merging this pull request may close these issues.

2 participants