Fix direct physics states observing outdated data when using separate physics thread #116897
Sauroniux wants to merge 1 commit into godotengine:master
This would also resolve #109213, which is essentially a duplicate of #113270, or the other way around I guess, but I did not realize this problem extended to things like child node transform propagation as well, which is much worse than just shape changes not being respected.
I mentioned a similar fix in #109213, but I must admit I'm hesitant to approve this, even if your change is arguably the correct and pragmatic thing to do here. The performance impact of this change can potentially be quite drastic, depending on the project.
Given how prevalent physics queries are in most games, and the fact that things like Viewport::_process_picking and NOTIFICATION_INTERNAL_PHYSICS_PROCESS run right after _physics_process and physics_frame, you would pretty much be guaranteed to flush every single mutating PhysicsServer*D operation on the main thread now [1], as opposed to the physics thread.
We would however still be running the actual simulation step on the physics thread at least, which I would imagine is the more costly part of the physics for most projects, so there would still be some benefit to running threaded I guess.
Footnotes
[1] Except for things like transform changes of kinematic bodies I guess, since those happen at the very end of the physics processing.
Good catch, I added the missing flushes to the PR.
Yeah, I agree that this is a potentially big performance hit to ensure correctness. I suppose an alternative to this change could be to expose a manual flush (that is not a random getter)? That way, a user could call it to flush manually performed modifications to the physics engine. For example, in Unity, after you move a transform, you have to call […]
A counterargument to that is that in issue #113270 the user isn't performing any manual changes themselves, the engine does. So it's probably not obvious that anything needs flushing.
I just realized there was one more actually,
Interesting. Godot does have
Yeah, agreed, I don't think exposing some sort of manual sync/flush is the play here, unfortunately.
If we really wanted to over-complicate things we could maybe have specific […]
I doubt it would make much of a difference though, since the likelihood of there not being a single transform change at the end of physics processing (when Godot's own spatial queries happen) is probably very small. It also makes performance quite volatile and hard(er) to reason about.
Fixed the
It's nice that the over-complication in this case can live in the […]
One thing that comes to mind to regain and potentially improve the query performance would be some form of asynchronous batch queries.
    LocalVector<RayParameters> parameters;
    LocalVector<RayResult> results;
    RID command_handle = PhysicsServer3D::get_singleton()->raycast_command(space, parameters, results);
    // Some other work happens...
    PhysicsServer3D::get_singleton()->ensure_finished(command_handle);
    // Process the results...
But then, the […]
It would however necessitate changing
The problem isn't so much that the queries are slower, but rather that they force a flush of the entire command queue. Even if we managed to perfectly batch every single query done across the entire frame into a single batch, you're still stuck waiting for that batch at the end of the physics processing, along with the entire command queue, on the main thread.
Anyway, I'll stew on this for a few more days, but I think we just need to accept this compromise for now. In fact, I'm almost tempted to say we should just be honest with ourselves and disable/bypass the command queue entirely between […]
That would largely supersede this change though.
I think it could still bring a tangible speedup, but that's a big feature on its own, not really related to fixing this bug.
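For reference, the asynchronous batch query idea discussed above could be sketched in plain C++ roughly as follows. This is only an illustration under stated assumptions: `raycast_batch_async`, `cast_against_ground`, and the simplified `RayParameters`/`RayResult` structs are hypothetical, nothing here is part of `PhysicsServer3D`, and a `std::future` stands in for the command handle.

```cpp
#include <cassert>
#include <future>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified ray types (vertical component only).
struct RayParameters {
    float origin_y;
    float direction_y;
};

struct RayResult {
    bool hit;
    float distance;
};

// Toy intersection test against a ground plane at y = 0.
RayResult cast_against_ground(const RayParameters &p_ray) {
    if (p_ray.origin_y > 0.0f && p_ray.direction_y < 0.0f) {
        return { true, p_ray.origin_y / -p_ray.direction_y };
    }
    return { false, 0.0f };
}

// Submits a whole batch at once; the returned future plays the role of
// the "command handle" from the snippet above (an assumption, not an API).
std::future<std::vector<RayResult>> raycast_batch_async(std::vector<RayParameters> p_rays) {
    return std::async(std::launch::async, [rays = std::move(p_rays)] {
        std::vector<RayResult> results;
        results.reserve(rays.size());
        for (const RayParameters &ray : rays) {
            results.push_back(cast_against_ground(ray));
        }
        return results;
    });
}

// Submits two rays, then blocks only at the point where results are needed.
bool batch_demo() {
    std::vector<RayParameters> rays = { { 2.0f, -1.0f }, { 1.0f, 1.0f } };
    std::future<std::vector<RayResult>> handle = raycast_batch_async(std::move(rays));
    // Some other work could happen here...
    std::vector<RayResult> results = handle.get(); // Equivalent of "ensure finished".
    return results.size() == 2 && results[0].hit && results[0].distance == 2.0f && !results[1].hit;
}
```

As noted in the discussion, batching like this would not remove the need to wait for the command queue before the results can be trusted; it only moves where the wait happens.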
So that would basically mean changing the
That's fine, the goal is to find the best possible solution.
Yes, exactly.
Fixes #113270
I tried the reported issue on 4.4, and there I was getting raycast hits in ~90% of the frames. On the latest versions, the raycast misses in 100% of the frames (except for the first frame).
I then tried bisecting it and arrived at pull request 109591.
After a bit more digging, the bug unfolds like this:
- flush_queries
- Area3D gets a notification that its global transform has changed, and it sets the new transform to the server.
- Area3D is parented to a RigidBody3D
- physics_process callback is invoked and a raycast is performed
The proposed change flushes any pending commands when retrieving direct state access. This is the same behavior as calling any other server function that returns a value. With the change, spatial queries and other direct state functions are able to operate on the latest data.
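To illustrate the idea outside of the engine, here is a minimal sketch of a server that queues mutations and optionally flushes them when direct state is retrieved. All names (`SketchPhysicsServer`, `SketchDirectState`, `get_direct_state`, `area_set_position`) are hypothetical stand-ins chosen for this example and do not mirror Godot's actual wrapper code.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical stand-ins; none of these names are Godot's actual types.
struct Vec3 {
    float x, y, z;
};

struct SketchDirectState {
    Vec3 area_position{ 0.0f, 0.0f, 0.0f };
};

class SketchPhysicsServer {
public:
    // Mutating calls are queued instead of being applied immediately,
    // mimicking a server wrapped in a command queue.
    void area_set_position(Vec3 p_position) {
        std::lock_guard<std::mutex> lock(queue_mutex);
        pending.push_back([this, p_position] { state.area_position = p_position; });
    }

    // Applies every queued mutation on the calling thread.
    void flush() {
        std::vector<std::function<void()>> commands;
        {
            std::lock_guard<std::mutex> lock(queue_mutex);
            commands.swap(pending);
        }
        for (const std::function<void()> &command : commands) {
            command();
        }
    }

    // p_flush_first = false models the old behavior (state may be stale);
    // p_flush_first = true models flushing when direct state is retrieved.
    const SketchDirectState &get_direct_state(bool p_flush_first) {
        if (p_flush_first) {
            flush();
        }
        return state;
    }

private:
    std::mutex queue_mutex;
    std::vector<std::function<void()>> pending;
    SketchDirectState state;
};

// Returns true if the unflushed read is stale and the flushed read is fresh.
bool stale_then_fresh() {
    SketchPhysicsServer server;
    server.area_set_position(Vec3{ 5.0f, 0.0f, 0.0f });
    float before = server.get_direct_state(false).area_position.x;
    float after = server.get_direct_state(true).area_position.x;
    return before == 0.0f && after == 5.0f;
}
```

The sketch also shows the trade-off raised in review: the flush runs on whichever thread asks for the direct state, so the queued work ends up on the caller.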
It's still possible to perform a spatial query on an outdated tree by:
But this change at least covers the bug that occurs without using any low-level access.