Skip to content

Conversation

@agarwal-navin
Copy link
Contributor

@agarwal-navin agarwal-navin commented Nov 14, 2025

The summaries written by Shared Tree and its summarizables are not versioned. Any changes to the summary format can lead to failures in clients that do not understand this new format. For example, the incremental summarization feature changes the forest's summary format by adding more nodes to its summary tree which older clients won't be able to read. Making this change behind a new version makes it safer as we can fail fast on seeing unrecognizable versions.
This change adds versioning by adding a metadata blob to the summary tree of the shared tree and its summarizables. This metadata blob will contain the format version of the summary. Every time, the format of the summary changes, a new version should be added so that clients that don't understand this format will fail.

The metadata blob will be written when minVersionForCollab is greater than the next release version 2.73.0.

Since the metadata blob and hence the versions did not exist before 2.73.0, changes to the summary format will not result in failing them during reading from the snapshot. Instead, the faiilures may manifest later when the underlying data is read. So, the recommended way to change format is to do it in a way that will break as soon as possible. For example, chaging the id of the summary node where the summary is stored will result in older clients not finding the snapshot.

AB#53723

@agarwal-navin agarwal-navin requested a review from a team as a code owner November 14, 2025 19:05
@github-actions github-actions bot added area: dds Issues related to distributed data structures area: dds: tree base: main PRs targeted against main branch labels Nov 14, 2025
@yann-achard-MS
Copy link
Contributor

The PR description currently says:

The metadata blob will be written when minVersionForCollab is greater than the next release version 2.73.0.
but the code seems to write the metadata when minVersionForCollab is greater or equal to version 2.73.0.

Since 2.73.0 has not yet been released, it seems fine to write the metadata when when minVersionForCollab is greater or equal to version 2.73.0, so I suggest updating the PR description.

function minVersionToDetachedFieldIndexSummaryVersion(
version: MinimumVersionForCollab,
): DetachedFieldIndexSummaryVersion {
return version < FluidClientVersion.v2_73
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment on the declaration of FluidClientVersion.v2_73 in src\codec\codec.ts that explains the format implications of using v2_73.

@yann-achard-MS
Copy link
Contributor

If I understand correctly, this PR ensures that future changes to the summary tree structures (so long as they update the version numbers written to the relevant metadata summary blobs) will make FF clients >= v2_73 fail fast when they do not know that version. This doesn't solve the issue of FF clients < v2_73, which is why the PR description suggests that future changes be made in a way that breaks them too.

My question is: why not proactively make such a change (either in this PR or as an immediate follow-up)?

Comment on lines +41 to +62
/**
* The versions for the detached field index summary.
*/
export const DetachedFieldIndexSummaryVersion = {
/**
* Version 0 represents summaries before versioning was added. This version is not written.
* It is only used to avoid undefined checks.
*/
v0: 0,
/**
* Version 1 adds metadata to the detached field index summary.
*/
v1: 1,
/**
* The latest version of the detached field index summary. Must be updated when a new version is added.
*/
vLatest: 1,
} as const;
export type DetachedFieldIndexSummaryVersion = Brand<
(typeof DetachedFieldIndexSummaryVersion)[keyof typeof DetachedFieldIndexSummaryVersion],
"DetachedFieldIndexSummaryVersion"
>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a worse version of what could be an enum.

You have type branding, but the actual values like DetachedFieldIndexSummaryVersion.v1 are not branded so you can't use them as a DetachedFieldIndexSummaryVersion.

Is there a reason to not just use an enum here?

If you really want to not use enums and use branding, then you should define a branded type, then brand the constants as members of that type (for example using the newish brandConst utility).

/**
* The storage key for the blob containing metadata for the detached field index's summary.
*/
export const detachedFieldIndexMetadataKey = ".metadata";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This starting with a "." is interesting.

If this is indented to mean something, maybe establish a pattern that "." keys are part of the top level summary not the summary items or something, it should be documented somewhere central, and this should link to it.

* The type for the metadata in the detached field index's summary.
* Using type definition instead of interface to make this compatible with JsonCompatible.
*/
// eslint-disable-next-line @typescript-eslint/consistent-type-definitions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not follow our linter's recommendations? Did you read the documentation for that rule and decide why its suggestion should apply to our codebase, but not to this specific instance? If so, please include that reason as a comment here. If not, go read it now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I'm pretty sure I know what it recommends and I agree with it here for a better developer experience, but I should have to know that from memory to review this: the linter is supposed to help tell people things when relevant.

Note 2: its way easier to get to that rule documentation from in the IDE when looking at the warning (click the link in the warning IIRC) than for a reviewer here. That's another reason to explain why this suppression is ok despite our choice to enable this lint inline here.

>;

/**
* The type for the metadata in the detached field index's summary.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably link the key for this.

version: MinimumVersionForCollab,
): DetachedFieldIndexSummaryVersion {
return version < FluidClientVersion.v2_73
? brand(DetachedFieldIndexSummaryVersion.v0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fact we have to brand our own constants to their intended type is removing a lot of the value of having named constants and a branded type. See my other comments about fixing this.

services,
parse,
);
assert(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure an "assert" is the right error type for this. When encountering a document too new for the current code, that indicates a usage error by the application team (either bad rollout schedule, bad min version for collab) or a decision to simply not support this client.

In none of those cases is this error a bug in the fluid framework code, which is what asserts indicate.

If there was some upstream checking which was supposed to ensure such a case would never make it to this point, then an assert here would be valid. If that is the case, can you add a comment explaining what check would catch this and give a usage effort before we get here? If not, replace this with a usage error with useful customer facing details (maybe we should have a shared utility for throwing such forwards compat errors?)

services,
parse,
);
assert(metadata.version === ForestSummaryVersion.v1, "Unsupported forest summary");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Load the incremental summary builder so that it can download any incremental chunks in the
// snapshot.
await this.incrementalSummaryBuilder.load(services, readAndParseBlob);
await this.incrementalSummaryBuilder.load(services, async (id: string) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a lot of "id"s in this code. Could you use a more specific name or type? Just looking at this code I don't even know what this async function is (can't see the parameter name for it from the call site).

It seems a bit odd to be adding this async function as a parameter here in this PR: is this part of the versioning logic somehow?

services,
parse,
);
assert(metadata.version === SchemaSummaryVersion.v1, "Unsupported schema summary");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we have several copies of this logic, maybe we can deduplicate them?

perhaps a "VersionedSummarizier" base-class which implements Summarizable would be a good approach?

I don't mind using base classes for cases like this where is all in one package and not exported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: dds: tree area: dds Issues related to distributed data structures base: main PRs targeted against main branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants