chore: add concurrent avatar loading budget#7784

Open
dalkia wants to merge 17 commits into dev from chore/avatar-budget

@dalkia (Collaborator) commented Mar 24, 2026

Pull Request Description

What does this PR change?

When many avatars appear at once (e.g., when teleporting to a crowded area), AvatarLoaderSystem immediately creates a WearablePromise for every avatar. Each avatar triggers ~5-7 asset bundle downloads that compete for bandwidth, which slows loading for all avatars.

This PR adds a ConcurrentLoadingPerformanceBudget that limits how many avatars can load wearables concurrently (default: 3). Avatars beyond the limit are deferred until slots free up.

Key changes:

  • AvatarShapeComponent: Added IAcquiredBudget LoadingBudget field and IsWearableInstantiated property to track budget ownership and wearable readiness
  • AvatarLoaderSystem: Budget-gates non-player Profile-based avatar creation and updates. Entities without budget skip AvatarShapeComponent creation and retry each frame. Main player always bypasses budget. Uses existing IAcquiredBudget/AcquiredBudget pattern for safe, idempotent release
  • AvatarInstantiatorSystem: Releases budget via LoadingBudget.Release() after avatar instantiation completes (in InstantiateAvatar)
  • AvatarCleanUpSystem: New DestroyPendingAvatar query releases budget for entities deleted before instantiation (no AvatarBase)
  • AvatarPlugin: New maxConcurrentAvatarLoads setting (default 3), creates ConcurrentLoadingPerformanceBudget and passes it to AvatarLoaderSystem
  • CharacterEmoteSystem: Defers emote downloading/playback until avatar wearables are instantiated (IsWearableInstantiated)
  • AvatarGhostSystem: Uses IsWearableInstantiated instead of raw InstantiatedWearables.Count check

SDK component paths (PBAvatarShape) are not budgeted in this PR — will be handled separately.
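
A minimal sketch of the acquire/release pattern described above, for readers unfamiliar with it. The type names `ConcurrentBudgetSketch` and `IAcquiredBudgetSketch` are hypothetical stand-ins, not the PR's `ConcurrentLoadingPerformanceBudget` or `IAcquiredBudget`, whose real members may differ. It assumes single-threaded access (e.g. systems running on the main thread).

```csharp
using System;

// Hypothetical stand-ins; the PR's real IAcquiredBudget and
// ConcurrentLoadingPerformanceBudget may look different.
public interface IAcquiredBudgetSketch
{
    void Release();
}

public sealed class ConcurrentBudgetSketch
{
    private readonly int maxConcurrent;
    private int inUse;

    public ConcurrentBudgetSketch(int maxConcurrent)
    {
        if (maxConcurrent <= 0) throw new ArgumentOutOfRangeException(nameof(maxConcurrent));
        this.maxConcurrent = maxConcurrent;
    }

    public int Available => maxConcurrent - inUse;

    // Returns false when every slot is taken; the caller retries on a later frame.
    public bool TrySpendBudget(out IAcquiredBudgetSketch acquired)
    {
        if (inUse >= maxConcurrent)
        {
            acquired = null;
            return false;
        }

        inUse++;
        acquired = new Slot(this);
        return true;
    }

    private sealed class Slot : IAcquiredBudgetSketch
    {
        private ConcurrentBudgetSketch owner;

        public Slot(ConcurrentBudgetSketch owner) { this.owner = owner; }

        // Idempotent: only the first call returns the slot to the owner.
        public void Release()
        {
            if (owner == null) return;
            owner.inUse--;
            owner = null;
        }
    }
}
```

Calling `Release()` twice on the same handle returns the slot only once, which is the property the instantiation and cleanup paths both rely on when either of them may run first.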

Test Instructions

Prerequisites

  • Use a development build with Avatar Debug panel enabled

Test Steps

  1. Enter world and open the Avatar Debug panel
  2. Instantiate 30 random avatars → all should appear (loading 3 at a time)
  3. Instantiate 300 random avatars → avatars should load progressively in batches, not all at once
  4. Verify the main player avatar always loads immediately regardless of budget
  5. Destroy all avatars → budget should return to full (no leaks)
  6. Instantiate avatars, then destroy random subsets → remaining avatars should continue loading
  7. Check the auth screen character preview → should work without errors (uses NoAcquiredBudget)

Additional Testing Notes

  • Verify no "Tried to release more budget than the max budget allows" exceptions
  • Test profile updates on remote avatars while budget is exhausted — updates should defer and retry
  • Emotes should not play on avatars that haven't finished loading wearables

Quality Checklist

  • Changes have been tested locally
  • Documentation has been updated (if required)
  • Performance impact has been considered
  • For SDK features: Test scene is included

@dalkia requested review from a team as code owners March 24, 2026 19:02
github-actions bot commented Mar 24, 2026

@dalkia changed the base branch from dev to chore/opti-transform-job March 24, 2026 19:03
Base automatically changed from chore/opti-transform-job to dev March 25, 2026 17:51
@decentraland-bot left a comment

Review Summary

Well-structured PR that introduces a concurrent loading budget for avatar wearables and splits the bone matrix pipeline into dedicated main-player and remote-avatar tracks. The architecture is clean, the budget lifecycle (acquire → load → release on instantiate/cleanup) is consistent across all code paths, and the test coverage is solid.

Highlights

  • Budget-gated loading is well integrated — main player always bypasses, SDK component and Profile paths both defer correctly, and cleanup/delete paths properly release the budget.
  • Pipeline split (MainPlayerPipeline / RemoteAvatarPipeline) is a smart move. Completing the main player immediately avoids TransformAccessArray lock contention with InterpolateCharacterSystem.
  • WearableLoadingState wrapper is a clear improvement over relying on IsConsumed / default checks on the raw promise. The explicit None → Loading → Consumed state machine makes the flow much easier to reason about.
  • Fallback try/catch in AvatarInstantiatorSystem for failed instantiation with retry using a fallback body shape is a nice resilience improvement.
  • Tests: AvatarLoadingBudgetShould covers the key scenarios: budget exhaustion, deferred loading, release-on-delete, main player bypass, and re-acquisition on profile update.

Items to discuss

  1. Default budget mismatch — The PR description says the default concurrent limit is 3, but AvatarShapeSettings.maxConcurrentAvatarLoads is set to 5. Which value is intended for production?

  2. AvatarRootGatherJob — full matrix inverse: math.inverse((float4x4)transform.localToWorldMatrix) computes a general 4×4 inverse. If avatar root transforms are rigid (rotation + translation, with no scale in the matrix), math.fastinverse would be cheaper: it transposes the rotation block and applies that transpose to the negated translation. TransformAccess doesn't expose worldToLocalMatrix directly, so inverting localToWorldMatrix is the only option, but fastinverse would be a low-risk perf win for the remote batch, where it runs per slot, including dummy slots.

  3. High-water-mark scheduling: RemoteAvatarPipeline.Schedule(batchCount) schedules avatarIndex tasks (the high-water mark), not the count of active avatars. Released slots are skipped via UpdateAvatar[idx] = false, but after heavy churn (many avatars created then destroyed), the job iterates over all ghost slots. Consider compacting avatarIndex when releasedIndexes.Count grows large, or tracking an active count.

  4. WearableLoadingState is a heap-allocated class inside a struct — Every AvatarShapeComponent allocates a new WearableLoadingState(). This is clearly intentional for reference semantics (marked readonly), but worth a brief doc comment noting why it's a class — future maintainers might be tempted to convert it to a struct.

  5. BoneMatrixCalculationJob parallelism change — Previously each bone was a parallel task; now each avatar is a task with an inner bone loop. With the batch count of 4, this means up to 4 avatars process per worker thread chunk. The Burst auto-vectorization of the inner loop should compensate, but if the remote avatar count stays low (≤5), this effectively serializes the bone calculation. Not a problem in practice given the budget limit, just noting the tradeoff.

  6. Minor: [NativeDisableParallelForRestriction] on bonesMatricesResult — Added because each avatar task now writes to a range [offset..offset+boneCount). The ranges are non-overlapping so this is safe, but a brief comment explaining why the restriction is disabled would help reviewers.

Overall this looks good. The budget lifecycle is consistent, cleanup paths are covered, and the pipeline split is well-motivated. No blocking issues found.

Requested by Juan Ignacio Molteni via Slack


public void Execute(int index, TransformAccess transform)
{
MatrixFromAllAvatars[index] = math.inverse((float4x4)transform.localToWorldMatrix);

Perf suggestion: Since avatar root transforms are rigid (rotation + translation), math.fastinverse would be significantly cheaper than a full math.inverse here. fastinverse assumes an orthonormal rotation matrix, so it only transposes the 3×3 rotation block and applies that transpose to the negated translation; no cofactor expansion is needed. This holds only if the root transform carries no scale.

MatrixFromAllAvatars[index] = math.fastinverse((float4x4)transform.localToWorldMatrix);

This runs for every slot (including dummies), so the savings accumulate.
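
For reference, the identity a rigid-transform inverse relies on can be sketched as follows. The symbols $R$ (the 3×3 rotation block) and $t$ (the translation column) are illustrative and do not appear in the PR:

```latex
M = \begin{pmatrix} R & t \\ 0^{\top} & 1 \end{pmatrix},
\qquad
M^{-1} = \begin{pmatrix} R^{\top} & -R^{\top} t \\ 0^{\top} & 1 \end{pmatrix}
\quad \text{when } R^{\top} R = I .
```

If the root carries a uniform scale $s \neq 1$, the upper-left block is $sR$ rather than $R$, the condition $R^{\top} R = I$ fails, and the shortcut no longer yields the exact inverse. It would be worth confirming that avatar roots are unscaled before making the swap.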

/// <summary>
/// Wraps a <see cref="WearablePromise"/> with an explicit loading status
/// so callers never rely on <c>default</c> or <c>IsConsumed</c> to decide
/// whether loading has been requested.

Consider adding a brief doc comment explaining why this is a class rather than a struct — e.g. "Reference type so that copies of the enclosing AvatarShapeComponent struct share the same loading state instance." This will prevent future contributors from accidentally converting it.
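
To illustrate the reference-versus-value behaviour this comment refers to, here is a small standalone sketch. All type names are hypothetical miniatures, not the PR's `AvatarShapeComponent` or `WearableLoadingState`.

```csharp
using System;

// Hypothetical miniature of the pattern; names are illustrative only.
public sealed class LoadingStateSketch
{
    public bool Consumed;
}

public struct ComponentWithClassState
{
    public readonly LoadingStateSketch Loading;

    public ComponentWithClassState(LoadingStateSketch loading) { Loading = loading; }
}

public struct ComponentWithInlineState
{
    public bool Consumed;
}

public static class CopySemanticsDemo
{
    public static (bool viaClass, bool viaStruct) Run()
    {
        var a = new ComponentWithClassState(new LoadingStateSketch());
        ComponentWithClassState aCopy = a;   // copies the reference only
        aCopy.Loading.Consumed = true;       // visible through 'a' as well

        var b = new ComponentWithInlineState();
        ComponentWithInlineState bCopy = b;  // copies the value
        bCopy.Consumed = true;               // 'b' is unaffected

        return (a.Loading.Consumed, b.Consumed);
    }
}
```

`CopySemanticsDemo.Run()` returns `(true, false)`: the mutation made through a copy is shared in the class-backed variant and lost in the inline-struct variant. Whether the PR actually needs that sharing depends on whether the component is ever mutated through a copy rather than through a `ref`.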

[field: SerializeField]
public int defaultMaterialCapacity = 100;

[field: SerializeField]

The PR description says the default is 3, but this is set to 5. Which is the intended production default? Might be worth aligning the description or adding a comment explaining the choice.

{
private readonly int boneCount;

[NativeDisableParallelForRestriction]

Minor: worth a one-liner explaining why the restriction is disabled — e.g. "Each avatar task writes to a non-overlapping [avatarIdx * boneCount .. (avatarIdx+1) * boneCount) range."
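
A rough sketch of the write pattern the comment describes, outside Unity. `Parallel.For` stands in for the job system's parallel scheduling, an `int` stands in for a bone matrix, and `BoneRangeSketch` is a hypothetical name. None of this is the PR's `BoneMatrixCalculationJob`.

```csharp
using System.Threading.Tasks;

public static class BoneRangeSketch
{
    // Fills a flat result array; the task for 'avatarIdx' writes only to
    // [avatarIdx * boneCount, (avatarIdx + 1) * boneCount).
    public static int[] Fill(int avatarCount, int boneCount)
    {
        var result = new int[avatarCount * boneCount];

        Parallel.For(0, avatarCount, avatarIdx =>
        {
            int offset = avatarIdx * boneCount;

            // Inner bone loop: sequential within one avatar task.
            for (int bone = 0; bone < boneCount; bone++)
                result[offset + bone] = avatarIdx; // stand-in for a bone matrix
        });

        return result;
    }
}
```

Because each task touches a disjoint slice, no two tasks write the same element. That is the invariant that would justify disabling the parallel-for write restriction on the real native array.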

@mikhail-dcl (Collaborator) left a comment

I will review the matrix calculation part separately.

HiddenByModifierArea = false;
IsPreview = false;
ShowOnlyWearables = showOnlyWearables;
this.LoadingBudget = loadingBudget;

You always pass NoBudget to the ctor, so maybe you should just assign it here directly instead of passing an argument?

/// so callers never rely on <c>default</c> or <c>IsConsumed</c> to decide
/// whether loading has been requested.
/// </summary>
public class WearableLoadingState

Storing a class reference in the struct is not ideal because of memory locality (and the extra allocations, of course). I checked a few usages, and you have a ref to the original component whenever you call methods of this class, so what is the real necessity for a class here?

if (avatarShapeComponent.WearableLoading.Status != AvatarShapeComponent.WearableLoadingStatus.None)
    return;

if (!avatarLoadingBudget.TrySpendBudget(out IAcquiredBudget acquiredBudget))

Because PartitionComponent is ignored when spending budget, you risk resolving avatars that are farther away first.

if (!ReadyToInstantiateNewAvatar(ref avatarShapeComponent)) return;

- if (!avatarShapeComponent.WearablePromise.SafeTryConsume(World, GetReportCategory(), out WearablesLoadResult wearablesResult)) return;
+ if (!avatarShapeComponent.WearableLoading.SafeTryConsume(World, GetReportData(), out WearablesLoadResult wearablesResult)) return;

SafeTryConsume was used here previously because of an unrelated concurrency issue; it is a workaround that shouldn't exist. Since you are already changing this code and have moved the method to the new class, please check whether it can simply be replaced with TryConsume, which is not a workaround.
