improvement(id-compressor): Add resetUnfinalizedCreationRange API and use in CR.replayPendingStates #26784

Merged
markfields merged 14 commits into microsoft:main from markfields:id-comp-replaypendingstates
Mar 25, 2026

Conversation

markfields (Member) commented Mar 19, 2026

Description

Background

Managing ID Allocation ops/batches in the ContainerRuntime has been a perennial source of difficulty; it complicates some current efforts around Batch tracking, and potentially Staging Mode as well.

One special case that we can simplify is what happens for "resubmit". The current flow submits an ID Allocation op before replaying pending ops. This diverges from the typical semantics of ID Allocation ops, which are only submitted right before an op that uses a new compressed ID.

The new API

So we are introducing IdCompressor.releaseUnfinalizedCreationRange (renamed resetUnfinalizedCreationRange before merge, per the updated PR title). It's similar to takeUnfinalizedCreationRange, but instead of returning the range, it merely resets the internal state nextRangeBaseGenCount to be before any unfinalized ranges, since we know they won't be finalized (the connection the ID Allocations were sent on closed without those acks).

Now, the next call to takeNextCreationRange will include those unfinalized ranges, and we don't need to interject an extra op before replay.

takeUnfinalizedCreationRange itself becomes a two-line shortcut. Maybe we should deprecate it, since it's also unused internally?
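
To make the cursor rewind concrete, here is a minimal toy model. It is a sketch under assumptions, not the actual IdCompressor: the ToyCompressor class, the ToyRange shape, and the finalizedGenCount field are hypothetical simplifications (the real compressor tracks clusters, sessions, and a normalizer). Only the method names and nextRangeBaseGenCount come from the description above.

```typescript
// Toy model only. Generation counts are 1-based and a range is just (first, count).
interface ToyRange {
    firstGenCount: number; // first ID (by generation count) covered by the range
    count: number; // number of IDs covered
}

class ToyCompressor {
    private localGenCount = 0; // total IDs generated locally
    private nextRangeBaseGenCount = 1; // cursor: first gen count not yet taken
    private finalizedGenCount = 0; // hypothetical: highest gen count finalized so far

    public generateCompressedId(): number {
        // Generating an ID does not touch the range-taking cursor.
        return ++this.localGenCount;
    }

    public takeNextCreationRange(): ToyRange | undefined {
        const count = this.localGenCount - this.nextRangeBaseGenCount + 1;
        if (count <= 0) {
            return undefined; // nothing new since the last take
        }
        const range = { firstGenCount: this.nextRangeBaseGenCount, count };
        this.nextRangeBaseGenCount = this.localGenCount + 1; // advance the cursor
        return range;
    }

    public finalizeCreationRange(range: ToyRange): void {
        this.finalizedGenCount = range.firstGenCount + range.count - 1;
    }

    public releaseUnfinalizedCreationRange(): void {
        // Rewind the cursor to just past the last finalized ID.
        this.nextRangeBaseGenCount = this.finalizedGenCount + 1;
    }

    public takeUnfinalizedCreationRange(): ToyRange | undefined {
        // The "two-line shortcut": rewind, then take.
        this.releaseUnfinalizedCreationRange();
        return this.takeNextCreationRange();
    }
}

const compressor = new ToyCompressor();
compressor.generateCompressedId(); // 1
compressor.generateCompressedId(); // 2
const acked = compressor.takeNextCreationRange()!; // covers 1-2
compressor.finalizeCreationRange(acked); // the ack arrived
compressor.generateCompressedId(); // 3
compressor.takeNextCreationRange(); // covers 3, sent but never acked
compressor.generateCompressedId(); // 4, generated while disconnected
compressor.releaseUnfinalizedCreationRange();
const next = compressor.takeNextCreationRange(); // covers 3 and 4 in one range
console.log(next);
```

In this model the lost range (ID 3) and the ID generated while disconnected (ID 4) come back as a single range from the next ordinary take, which is the behavior the description relies on.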

Reconnect Flow

By the time we get to replayPendingStates, we know everything pending has not been sequenced, and never will be until we resubmit. I.e. we've finalized ALL successfully submitted ranges. So we can safely move nextRangeBaseGenCount back according to the local clusters - those IDs will not be finalized by any in-flight ops, and should be included in the next creation range, triggered whenever the resubmitted main content (or subsequent changes) demands.

Testing

Updated a test that was added for that replayPendingStates codepath to demonstrate that this works (if you comment out the new call, the test fails with Ranges Finalized Out Of Order).
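
The failure mode the test guards against can be illustrated with a small toy. This is a sketch under assumptions, not the real test or compressor: ToyState, finalize, and reconnectScenario are hypothetical, and the contiguity check only stands in for whatever the real finalization path asserts.

```typescript
// Toy state only. Generation counts are 1-based.
interface ToyState {
    localGenCount: number;
    nextRangeBaseGenCount: number;
    finalizedGenCount: number;
}
interface ToyRange {
    firstGenCount: number;
    count: number;
}

function generate(s: ToyState, n: number): void {
    s.localGenCount += n;
}

function takeNext(s: ToyState): ToyRange {
    const range = {
        firstGenCount: s.nextRangeBaseGenCount,
        count: s.localGenCount - s.nextRangeBaseGenCount + 1,
    };
    s.nextRangeBaseGenCount = s.localGenCount + 1;
    return range;
}

function finalize(s: ToyState, r: ToyRange): void {
    // Assumed invariant: each finalized range must start right after the previous one.
    if (r.firstGenCount !== s.finalizedGenCount + 1) {
        throw new Error("Ranges finalized out of order");
    }
    s.finalizedGenCount += r.count;
}

function reconnectScenario(rewindCursor: boolean): string {
    const s: ToyState = { localGenCount: 0, nextRangeBaseGenCount: 1, finalizedGenCount: 0 };
    generate(s, 2);
    takeNext(s); // range 1-2 is sent, but the connection drops before the ack
    generate(s, 1); // ID 3 is generated while offline
    if (rewindCursor) {
        // Stands in for the new call made from replayPendingStates.
        s.nextRangeBaseGenCount = s.finalizedGenCount + 1;
    }
    try {
        finalize(s, takeNext(s)); // the first range sequenced after reconnect
        return "ok";
    } catch (e) {
        return (e as Error).message;
    }
}

const withRewind = reconnectScenario(true);
const withoutRewind = reconnectScenario(false);
```

Without the rewind, the first range taken after reconnect starts at ID 3 while IDs 1-2 were never finalized, so the toy check throws; with the rewind, a single range covering 1-3 finalizes cleanly.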



Here's the summary of the design discussion Claude and I had, comparing this ("Approach B") to 2 other alternatives (Approach A was very bad so omitted here):

Approaches for Deferring ID Allocation Op During Replay

Problem

During replayPendingStates, the container runtime calls takeUnfinalizedCreationRange and immediately submits an IdAllocation op. Submitting this op during replay is problematic because it departs from the usual pattern, in which an IdAllocation op is submitted only right before an op that uses a new compressed ID. We want to defer the allocation so it is included in the next naturally-submitted IdAllocation op instead.

Background

  • takeNextCreationRange returns IDs generated since the last range was taken, starting from an internal cursor (nextRangeBaseGenCount), and advances the cursor forward.
  • takeUnfinalizedCreationRange returns ALL unfinalized IDs (going back to the last finalized cluster), and also advances the cursor forward.
  • generateCompressedId does not interact with the range-taking cursor at all -- it only touches localGenCount, cluster state, and the normalizer.

Approach B: releaseUnfinalizedCreationRange (IdCompressor change)

Add a new void method to IIdCompressorCore / IdCompressor:

public releaseUnfinalizedCreationRange(): void {
    // Reset nextRangeBaseGenCount back to the start of the unfinalized region
}

The container runtime calls this during replay instead of submitting an op. The next takeNextCreationRange call naturally produces a range covering both the old unfinalized IDs and any new IDs generated in the interim.

Pros

  • State tracking is consolidated inside the IdCompressor
  • No merge logic needed -- takeNextCreationRange handles everything
  • Only a single range is ever produced, so no ordering/overlap issues
  • normalizer.getRangesBetween works correctly for the expanded range

Cons

  • New method on IIdCompressorCore (a @legacy @beta public interface)
  • nextRangeBaseGenCount going backward is a new pattern (currently it only advances)
  • The released range is opaque -- caller cannot inspect it for logging/debugging
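
Seen from the runtime's side, Approach B might look roughly like the sketch below. Everything here is illustrative: ToyRuntime, RangeSource, ToyOp, and the stub compressor are hypothetical stand-ins, and the real ContainerRuntime involves batching and far more state.

```typescript
// Hypothetical minimal shapes, for illustration only.
interface ToyRange {
    firstGenCount: number;
    count: number;
}
interface RangeSource {
    takeNextCreationRange(): ToyRange | undefined;
    releaseUnfinalizedCreationRange(): void;
}
type ToyOp =
    | { type: "idAllocation"; range: ToyRange }
    | { type: "content"; payload: string };

class ToyRuntime {
    public readonly outbox: ToyOp[] = [];

    constructor(
        private readonly ids: RangeSource,
        private readonly pending: string[],
    ) {}

    private submitIdAllocationOpIfNeeded(): void {
        const range = this.ids.takeNextCreationRange();
        if (range !== undefined) {
            this.outbox.push({ type: "idAllocation", range });
        }
    }

    public submit(payload: string): void {
        // The allocation op goes out right before the op that may use new IDs.
        this.submitIdAllocationOpIfNeeded();
        this.outbox.push({ type: "content", payload });
    }

    public replayPendingStates(): void {
        // No up-front allocation op: just rewind the cursor, then resubmit.
        this.ids.releaseUnfinalizedCreationRange();
        for (const payload of this.pending.splice(0)) {
            this.submit(payload);
        }
    }
}

// Stub compressor: releasing makes one unfinalized range available to the next take.
let unfinalized: ToyRange | undefined;
const stubIds: RangeSource = {
    releaseUnfinalizedCreationRange: () => {
        unfinalized = { firstGenCount: 1, count: 3 };
    },
    takeNextCreationRange: () => {
        const range = unfinalized;
        unfinalized = undefined;
        return range;
    },
};

const runtime = new ToyRuntime(stubIds, ["opA", "opB"]);
runtime.replayPendingStates();
const opTypes = runtime.outbox.map((op) => op.type);
```

In this sketch the only allocation op is the one emitted by the ordinary submit path immediately before the first resubmitted content op, rather than a dedicated op at the start of replay.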

Approach C: Boolean flag in container runtime (no IdCompressor change)

Add a boolean flag (needsUnfinalizedResubmit) to the container runtime. Set it during replay instead of submitting an op. In submitIdAllocationOpIfNeeded, check the flag:

const idRange = this.needsUnfinalizedResubmit
    ? this._idCompressor.takeUnfinalizedCreationRange()
    : this._idCompressor.takeNextCreationRange();
this.needsUnfinalizedResubmit = false;

Pros

  • Zero changes to IdCompressor or its public API
  • Uses existing, well-tested methods (takeUnfinalizedCreationRange)
  • No new invariants -- nextRangeBaseGenCount continues to only advance
  • Simpler to review and reason about

Cons

  • State is split across two components (boolean in container runtime, range state in compressor)
  • Container runtime must know about the distinction between the two take methods (though it already does today)
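
For comparison, a self-contained version of the Approach C snippet could look like the following. It is a sketch only: FlaggedRuntime, TakeMethods, and the recording stub are hypothetical, and only needsUnfinalizedResubmit and the two take method names come from the text above.

```typescript
// Hypothetical minimal shapes, for illustration only.
interface ToyRange {
    firstGenCount: number;
    count: number;
}
interface TakeMethods {
    takeNextCreationRange(): ToyRange | undefined;
    takeUnfinalizedCreationRange(): ToyRange | undefined;
}

class FlaggedRuntime {
    private needsUnfinalizedResubmit = false;
    public readonly submittedRanges: ToyRange[] = [];

    constructor(private readonly idCompressor: TakeMethods) {}

    public replayPendingStates(): void {
        // Defer: remember that the next allocation must cover all unfinalized IDs.
        this.needsUnfinalizedResubmit = true;
    }

    public submitIdAllocationOpIfNeeded(): void {
        const idRange = this.needsUnfinalizedResubmit
            ? this.idCompressor.takeUnfinalizedCreationRange()
            : this.idCompressor.takeNextCreationRange();
        this.needsUnfinalizedResubmit = false; // the flag is single-use
        if (idRange !== undefined) {
            this.submittedRanges.push(idRange);
        }
    }
}

// Recording stub so the choice of take method is observable.
const calls: string[] = [];
const recordingCompressor: TakeMethods = {
    takeNextCreationRange: () => {
        calls.push("next");
        return undefined;
    },
    takeUnfinalizedCreationRange: () => {
        calls.push("unfinalized");
        return { firstGenCount: 1, count: 2 };
    },
};

const flagged = new FlaggedRuntime(recordingCompressor);
flagged.submitIdAllocationOpIfNeeded(); // normal path
flagged.replayPendingStates(); // sets the flag, submits nothing
flagged.submitIdAllocationOpIfNeeded(); // uses takeUnfinalizedCreationRange once
flagged.submitIdAllocationOpIfNeeded(); // back to the normal path
```

The split state called out in the Cons is visible here: the runtime holds the boolean while the compressor still owns the cursor, so correctness depends on the two staying in step.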

Recommendation

Both approaches are functionally equivalent and correct. Approach C is the lower-risk path (no API change, uses existing methods). Approach B is the more principled one (compressor owns its own state).

Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the ID allocation replay/resubmit flow to avoid submitting an extra IdAllocation op up-front during ContainerRuntime.replayPendingStates, by introducing a new IdCompressor.releaseUnfinalizedCreationRange() API that rewinds the next-take cursor to include all unfinalized IDs in the next naturally-submitted allocation range.

Changes:

  • Add IIdCompressorCore.releaseUnfinalizedCreationRange(): void and implement it in IdCompressor.
  • Refactor takeUnfinalizedCreationRange() to delegate to releaseUnfinalizedCreationRange() + takeNextCreationRange().
  • Update runtime/test logic to rely on the released cursor and validate ranges/finalization across reconnect.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/runtime/id-compressor/src/types/idCompressor.ts | Adds the new core API contract and docs for releasing unfinalized ranges. |
| packages/runtime/id-compressor/src/idCompressor.ts | Implements cursor rewind via releaseUnfinalizedCreationRange and simplifies takeUnfinalizedCreationRange. |
| packages/runtime/id-compressor/src/test/idCompressor.spec.ts | Adds coverage ensuring released ranges are included on the next takeNextCreationRange. |
| packages/runtime/id-compressor/replay-id-allocation-approaches.md | Adds design notes comparing alternative replay approaches. |
| packages/runtime/id-compressor/api-report/id-compressor.legacy.beta.api.md | Updates API report to include the new method on IIdCompressorCore. |
| packages/runtime/container-runtime/src/containerRuntime.ts | Switches replay behavior to release ranges instead of submitting a dedicated resubmit allocation op. |
| packages/runtime/container-runtime/src/test/containerRuntime.spec.ts | Updates reconnect test to validate IDs from a "lost take" are finalized after reconnect with a single allocation op. |

anthony-murphy (Contributor) left a comment

this change seems like a good direction to me, but i don't have deep id-compressor knowledge

markfields changed the title from "improvement(id-compressor): Add releaseUnfinalizedCreationRange API and use in CR.replayPendingStates" to "improvement(id-compressor): Add resetUnfinalizedCreationRange API and use in CR.replayPendingStates" on Mar 25, 2026
github-actions (Contributor) commented

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> fluid-framework-docs-site@0.0.0 ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> fluid-framework-docs-site@0.0.0 serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> fluid-framework-docs-site@0.0.0 check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  272172 links
    1863 destination URLs
    2108 URLs ignored
       0 warnings
       0 errors


taylorsw04 (Contributor) commented

@markfields I thought you were going to deprecate (or, honestly, just remove--I've audited all external users' repos) takeUnfinalizedCreationRange? I'd like to not let the surface grow.

markfields (Member, Author) commented Mar 25, 2026

deprecate (or, honestly, just remove...)

@taylorsw04 - I'm hoping to deprecate this whole IIdCompressorCore interface, with a view to moving it to @internal in 2.100. We'd need to get the announcement together by Friday.

I'll either do that, or have a 1-liner follow-up PR that deprecates the new thing, and then remove it in 2.100.

markfields enabled auto-merge (squash) March 25, 2026 20:28
markfields merged commit b00591c into microsoft:main Mar 25, 2026
34 checks passed
markfields deleted the id-comp-replaypendingstates branch March 25, 2026 20:58
5 participants