chore: optimize avatar bone matrix calculation pipeline #7604
Open
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Windows and Mac build successful in Unity Cloud! You can find a link to the downloadable artifact below.
…when size increase
NickKhalow
approved these changes
Mar 19, 2026
Explorer/Assets/DCL/AvatarRendering/AvatarShape/Components/AvatarTransformMatrixJobWrapper.cs
Outdated
DafGreco
approved these changes
Mar 23, 2026
DafGreco
left a comment
✔️ PR reviewed and approved by QA on both platforms, following the test instructions and exercising both the happy and unhappy paths.
Regression testing was performed for this ticket to verify that the normal flow works as expected:
- [✔️] Backpack and wearables in world
- [✔️] Emotes in world and in backpack
- [✔️] Teleport with map/coordinates/Jump In
- [✔️] Chat and multiplayer
- [✔️] Profile card
- [✔️] Camera
- [✔️] Skybox
The prod environment and the PR build were compared side by side: as expected, the PR delivers more FPS than prod when instantiating 300 avatars.
In addition, all points of the PR were checked and no issues were found, so this PR is ready to be merged! 🚀
Pull Request Description
What does this PR change?
Improves the avatar bone matrix calculation pipeline with three key optimizations, plus fixes a critical bug causing infinite avatar re-instantiation.
1. TransformAccessArray for parallel bone gathering
Bone localToWorldMatrix reads are now performed on worker threads via TransformAccessArray + IJobParallelForTransform (dedicated BoneGatherJob and AvatarRootGatherJob), instead of iterating every bone per avatar on the main thread. This eliminates a significant main-thread bottleneck that scaled linearly with avatar count: 62 Transform.localToWorldMatrix calls per avatar, each involving a managed-to-native transition.
2. Dedicated main player pipeline to unblock InterpolateCharacterSystem
The main player avatar is processed in its own separate small pipeline (62 bones + 1 root) that schedules and completes immediately within StartAvatarMatricesCalculationSystem. This releases the TransformAccessArray lock on the main player's transforms before InterpolateCharacterSystem runs in ChangeCharacterPositionGroup, preventing the job system from blocking the main thread when the character controller needs to write to the player transform hierarchy. Remote avatars use a separate batched pipeline with deferred completion in PreRenderingSystemGroup.
3. Per-avatar Burst-optimized bone calculation (from PR #7230)
BoneMatrixCalculationJob is now parallelized per avatar rather than per bone. Each job task loads the avatar matrix once and loops over 62 bones using math.mul on float4x4 natively, a tight sequential range that Burst can auto-vectorize. This also eliminates Matrix4x4 ↔ float4x4 casts throughout the pipeline. Additionally, TransformAccessArrays are rebuilt lazily (only when avatars are added/removed), and the register-once pattern (RegisterAvatar/RegisterMainPlayerAvatar) avoids redundant per-frame main-thread work.
4. Structural cleanup: pipeline separation into dedicated classes
AvatarTransformMatrixJobWrapper has been split into three files for clarity:
- MainPlayerPipeline: Dedicated single-avatar pipeline. Registered once at avatar creation and never released (bones are stable for the lifetime of the game). No dummy transform fallback needed.
- RemoteAvatarPipeline: Batched pipeline for all remote avatars with dynamic resizing, index recycling, and deferred completion.
- AvatarTransformMatrixJobWrapper: Thin orchestrator that owns both pipelines and the shared dummy transform.
The main player pipeline is now skipped during wearable changes (ReleaseAvatar no longer tears down the main player registration), since the skeleton and bone transforms persist across wearable swaps.
5. In-place TAA slot updates to eliminate rebuild bottleneck
RemoteAvatarPipeline now uses flat backing arrays (flatBones, flatRoots) pre-filled with dummyTransform, replacing the jagged Transform[][] + Transform[] arrays. When avatars are registered or released, TAA slots are updated in-place via TransformAccessArray[index] = transform (O(62) per avatar) instead of triggering a full rebuild of all slots (O(N×62)). Full TAA rebuilds now only occur on capacity growth (rare). This eliminates the main-thread stall that occurred every frame during bulk avatar instantiation: previously, destroying 500 avatars and reinstantiating them caused ~50 consecutive frames of 10-20 ms stalls from handle.Complete() + full TAA reconstruction.
6. Fix: Profile.IsDirty never reset, causing infinite avatar re-instantiation
Bug fix: AvatarLoaderSystem checked profile.IsDirty to trigger avatar shape updates, but never reset it to false after consuming it. This caused an infinite loop:
  1. AvatarLoaderSystem sees profile.IsDirty == true, sets avatarShapeComponent.IsDirty = true
  2. AvatarInstantiatorSystem re-instantiates the avatar (full material/wearable/skinning setup), sets avatarShapeComponent.IsDirty = false
  3. AvatarLoaderSystem sees profile.IsDirty is still true → go to step 1
Every avatar that ever received a profile update was being fully re-instantiated every frame indefinitely, gated only by the frame time budget. This masked itself as "normal avatar overhead" but was consuming the entire instantiation budget on redundant work. Fixed by resetting profile.IsDirty = false in AvatarLoaderSystem after consuming it, with shared logic extracted into ApplyProfileToAvatarShape.
Files changed:
- BoneMatrixCalculationJob.cs: IJobParallelFor per-avatar with float4x4/math.mul
- TransformGatherJobs.cs: New BoneGatherJob + AvatarRootGatherJob (IJobParallelForTransform)
- AvatarTransformMatrixJobWrapper.cs: Thin orchestrator delegating to the two pipeline classes, passes dummyTransform to RemoteAvatarPipeline
- MainPlayerPipeline.cs: Dedicated main player bone matrix pipeline (register-once, immediate completion)
- RemoteAvatarPipeline.cs: Batched remote avatar pipeline with flat backing arrays and in-place TAA updates
- AvatarTransformMatrixComponent.cs: Added IsMainPlayer flag for pipeline routing
- StartAvatarMatricesCalculationSystem.cs: Split query: PlayerComponent → main player pipeline, others → remote
- FinishAvatarMatricesCalculationSystem.cs: Routes to correct result array based on IsMainPlayer
- ReleaseAvatar.cs: Main player pipeline is never released on wearable change
- AvatarInstantiatorSystem.cs: Passes releaseFromPipeline: false for main player wearable changes
- AvatarLoaderSystem.cs: Reset profile.IsDirty = false after consuming, extracted ApplyProfileToAvatarShape
Performance comparison
With 400 avatars in an isolated world, there are clear gains. Left is dev, right is this PR.
InterpolateCharacterSystem does not show the bottleneck originally presented in this issue. For that system, the difference between dev and this branch is negligible.
Test Instructions
Test Steps
AvatarInstantiatorSystem and AvatarLoaderSystem show negligible cost in the Profiler (no repeated re-instantiation)
Additional Testing Notes
StartAvatarMatricesCalculationSystem should be significantly reduced
Quality Checklist
🤖 Generated with Claude Code
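For reviewers who want to see the dirty-flag loop from section 6 in isolation, here is a minimal sketch in plain C#. It is not the project's code: Profile and AvatarShape are hypothetical stand-ins for the real profile and avatarShapeComponent types, LoaderStep and InstantiatorStep only mimic the roles of AvatarLoaderSystem and AvatarInstantiatorSystem, and the frame time budget is ignored.

```csharp
using System;

// Hypothetical stand-ins for the real profile and avatar shape component types.
public sealed class Profile { public bool IsDirty; }
public sealed class AvatarShape { public bool IsDirty; }

public static class DirtyFlagSketch
{
    // Mimics the loader: propagates the profile's dirty flag to the avatar shape.
    // resetAfterConsume = false models the old behaviour, true models the fix.
    public static void LoaderStep(Profile profile, AvatarShape shape, bool resetAfterConsume)
    {
        if (!profile.IsDirty) return;

        shape.IsDirty = true;

        if (resetAfterConsume)
            profile.IsDirty = false; // the flag is consumed exactly once
    }

    // Mimics the instantiator: rebuilds the avatar when the shape is dirty.
    // Returns true if a (simulated) re-instantiation happened this frame.
    public static bool InstantiatorStep(AvatarShape shape)
    {
        if (!shape.IsDirty) return false;

        shape.IsDirty = false;
        return true;
    }

    // Simulates a number of frames after a single profile update and
    // counts how many times the avatar gets re-instantiated.
    public static int CountInstantiations(int frames, bool resetAfterConsume)
    {
        var profile = new Profile { IsDirty = true }; // one profile update arrives
        var shape = new AvatarShape();
        int count = 0;

        for (int frame = 0; frame < frames; frame++)
        {
            LoaderStep(profile, shape, resetAfterConsume);
            if (InstantiatorStep(shape)) count++;
        }

        return count;
    }
}
```

In this sketch, CountInstantiations(5, resetAfterConsume: false) returns 5 (one rebuild per frame, indefinitely), while CountInstantiations(5, resetAfterConsume: true) returns 1 (a single rebuild for a single profile update).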