Move score submission to spectator server off of web #367
Description
The end goal here is to make the core gameplay loop of the client as unaffected by web dying as possible. Other online stuff is presumed to be allowed to be broken.
Note that this really only moves the single point of failure for the core gameplay loop from web to spectator server and I am very aware of this.
The submission flow in web is quite sprawling, with a bunch of safeties on it for safety's sake. Therefore I imagine any of this would start with a PoC with all of the safeties off. The safeties will then be brought up, but the PoC would establish viability.
If any of this succeeds and we get to something shippable, I think a gradual rollout with potential fallback to the old path client-side would probably be best. It's a big "if" though.
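To make the gradual rollout idea a bit more concrete, here is a minimal sketch of how a client could deterministically opt a percentage of users into the new path and fall back to the old one on connection failure. Everything here is hypothetical: `in_rollout`, `submit_score`, the `submit_new` / `submit_old` callables and the percentage knob are illustrative names, not existing client or server API.

```python
import hashlib


def in_rollout(user_id: int, percent: int) -> bool:
    """Deterministically map a user to a bucket in [0, 100) and compare to the rollout percentage."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def submit_score(user_id, score, percent, submit_new, submit_old):
    """Try the new (spectator server) path for users in the rollout, else use the old (web) path.

    submit_new / submit_old are hypothetical callables standing in for the two transports.
    Returns a (path, result) tuple so the caller can tell which path was used.
    """
    if in_rollout(user_id, percent):
        try:
            return ("new", submit_new(score))
        except ConnectionError:
            # Fall through to the old path if the new one is unreachable.
            pass
    return ("old", submit_old(score))
```

One caveat with this shape: if the new path received the score but the connection dropped before the response arrived, falling back would submit twice, so the fallback is only safe if submission is idempotent on the server side.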
Considerations
Client reconnection must work
#193 must be closed as a prerequisite. Starting this makes zero sense without it.
Evaluate whether signalr messages are a reliable enough vector for submission requests, or if we need HTTP instead
It is not clear to me at this time whether signalr messaging primitives are dependable enough for submission, or whether we need a normal ASP.NET API next to the signalr hubs to get proper ordering guarantees.
I'd likely need to research and test at least the ordering & delivery guarantees of signalr (or defer to pre-existing knowledge if someone else has already tested this).
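One way such a test could be structured, as a rough sketch: have the client stamp every message with an incrementing sequence number, and have the receiving side record what it actually saw. The `SequenceAudit` class below is a hypothetical helper, not part of signalr or the spectator server, and it only covers the bookkeeping half. The transport under test would feed `observe()`.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceAudit:
    """Records sequence numbers as they arrive and reports delivery anomalies."""

    seen: set = field(default_factory=set)
    duplicates: list = field(default_factory=list)
    late: list = field(default_factory=list)
    highest: int = -1

    def observe(self, seq: int) -> None:
        if seq in self.seen:
            # Same message delivered more than once.
            self.duplicates.append(seq)
            return
        if seq < self.highest:
            # Arrived after a message with a higher sequence number.
            self.late.append(seq)
        self.seen.add(seq)
        self.highest = max(self.highest, seq)

    def missing(self) -> list:
        """Sequence numbers below the highest seen that never arrived."""
        return [n for n in range(self.highest + 1) if n not in self.seen]
```

Running this across forced disconnects and reconnects would show whether messages get dropped, duplicated or reordered, which is the information needed to decide between signalr messages and plain HTTP.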
Figure out how to address Russian blocks
Users (not only, but primarily) from Russia frequently report issues connecting to spectator server. So far we've been ignoring this as the biggest "impact" this has is not having any replays for these users, which is in the "bad-but-not-critical" territory.
If we move the entirety of submission to spectator server then that would mean essentially turning the game off for these users unless they VPN.
A solution that might work is sunsetting the spectator.ppy.sh domain and using something under osu.ppy.sh, as some people claim the block is domain-based. That claim kind of holds up, because people also report things like beatmap mirrors no longer working, and those are indeed under different domains.
This would require testing with affected users. Not even sure whether it is doable infrastructurally with how everything else is set up (cc @ThePooN).
Likely need a resilient backing store
Thus far, killing spectator server has only meant that some replays get dropped, which again is "bad-but-not-critical". If we move score submission to spectator server, killing it will mean people get their scores fully dropped. How acceptable this is lies in the eye of the beholder, but I imagine it will rile people up if we ever drop scores.
So we likely need something else resilient backing this. Maybe redis, although I'm not sure about that, as I'm very cautious about even approaching redis these days due to prior complaints. Maybe even kafka.
Opportunity to handle submission retry transparent to users
If we can get the resilient backing store online then maybe we can implement submission retry in a way that requires no client-side support. On a quick think, most of everything I typed up in ppy/osu#24609 (comment) would be obviated by a server-side retry.
That said, this again only works if we can have reliable delivery of messages to spectator server. Even if all that the "submit score" operation or endpoint does is drop the score into a redis or kafka queue to be processed, this queueing still has to always succeed and never drop anything. Many of the preceding points are a prerequisite for this; if #193 turns out to have a reliable solution that preserves message ordering on reconnection, and we can safely use signalr messages as transport for submission, then that would maybe resolve that concern, and all that is left is a resilient backing store.
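To illustrate the rough shape of "enqueue first, retry server-side", here is a minimal in-memory sketch. It is only a stand-in: a real version would sit on a durable store (redis, kafka or otherwise), and `SubmissionQueue`, the `token` used for de-duplication and `max_attempts` are all hypothetical names chosen for the example.

```python
from collections import deque


class SubmissionQueue:
    """In-memory stand-in for a durable queue with de-duplication and bounded retry."""

    def __init__(self):
        self._pending = deque()
        self._known_tokens = set()
        self.processed = {}

    def enqueue(self, token, score) -> bool:
        """Accept a submission. Returns False if this token was already accepted (e.g. a client resend)."""
        if token in self._known_tokens:
            return False
        self._known_tokens.add(token)
        self._pending.append((token, score, 0))
        return True

    def drain(self, process, max_attempts=3) -> list:
        """Attempt every currently pending item once.

        Failed items are requeued until max_attempts is reached; tokens that
        exhausted their attempts are returned so they can be inspected manually.
        """
        dead = []
        for _ in range(len(self._pending)):
            token, score, attempts = self._pending.popleft()
            try:
                self.processed[token] = process(score)
            except Exception:
                if attempts + 1 < max_attempts:
                    self._pending.append((token, score, attempts + 1))
                else:
                    dead.append(token)
        return dead
```

The important property is that `enqueue` is the only step the client waits on, and it is idempotent per token, so a client that resends after a reconnect cannot produce a duplicate score. Whether the real submit operation can be made that cheap and that reliable is exactly the open question above.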
Scaling
Thus far all the scaling that's been done with spectator server is vertical. No horizontal scaling has been attempted.
Whether we want to change that in light of this possible effort is very much up in the air and I'm not even sure I understand the full considerations or implications myself. It'd turn this already-large endeavour into an even larger one because none of the existing hubs are written in a way that accommodates horizontal scaling.
It is however a possible factor in any technical decisions that will be made here (including the decision as to whether any of this should even be attempted), so I am mentioning it explicitly.