♻️ [PANA-5123] Assign node ids in preorder when serializing#4002
Conversation
| const expectedHost = expectNewNode({ type: NodeType.Element, tagName: 'div' }) | ||
| const shadowRootNode = expectNewNode({ type: NodeType.DocumentFragment, isShadowRoot: true }) | ||
| const child = expectNewNode({ type: NodeType.Element, tagName: 'span' }) |
There was a problem hiding this comment.
The helper function expectNewNode() assigns ids to each node in order, so we have to reverse the order of these calls to get ids that match the new approach.
|
|
||
| expect(serializeDocumentNode(document, NodePrivacyLevel.ALLOW, transaction)).toEqual({ | ||
| type: NodeType.Document, | ||
| id: 0, |
There was a problem hiding this comment.
It was necessary to add this because serializeDocumentNode() used to return a node with no id assigned, but now it returns a node with an id. Conveniently, the id of the document node is always zero with pre-order assignment!
| export * from '../types' | ||
|
|
||
| export { serializeNodeWithId } from '../domain/record' | ||
| export { serializeNode, serializeNode as serializeNodeWithId } from '../domain/record' |
There was a problem hiding this comment.
There are some known external callers of this function in Datadog-internal code. I've exported the function under both the old and new names for compatibility, but I plan to update those callers once this change ships, and then we can remove the old name.
|
|
||
| export const enum NodeIdConstants { | ||
| FIRST_ID = 1, | ||
| FIRST_ID = 0, |
There was a problem hiding this comment.
Why it needs to start at 0, serialization requirement?
Is this change backwards compatible?
Are we positive we are not using 0 elsewhere as a sentinel value?
Sorry if these don't make sense, this change seems risky for someone without all the context (like me 😄)
There was a problem hiding this comment.
Why it needs to start at 0, serialization requirement?
It's because in the new data format, the ids can also be interpreted as array indices; it'd be awkward if they didn't start at 0.
Is this change backwards compatible? Are we positive we are not using 0 elsewhere as a sentinel value?
For the data format, yes. For code which might confuse 0 with other values like undefined (since 0 is falsy), there is definitely a potential risk. I've updated the code in both this repo and in Datadog-internal repos to handle this case correctly. (Almost everything already did, fortunately, but there were a few small issues that have been hammered out.)
…wn-when-serializing * main: (26 commits) ♻️ [PANA-5105] Serialize all DOM attribute values as strings (#3999) 🎨 [PANA-5053] Separate DOM and virtual attribute serialization (#3998) 🐛 clear chain after finalize (#4027) 🎨 [PANA-5222] Make CODEOWNERS more accurate for recording code (#4034) ♻️ Replace longTaskRegistry by longTaskContexts (#4013) 👷 Bump staging to staging-51 Add flagEvaluationEndpointBuilder to TransportConfiguration interface. (#4025) 👷 Enable more renovate flags... (#4023) 🐛 fix developer extension packaging (#4024) 👷 add a script to easily create an access token (#4020) 👷 Ensure that renovate do a full install (#4021) [RUM Browser Profiler] stop profiler when session expires (#4011) 👷 Ensure to have tarballs built at install (#4019) 👷: migrate config renovate.json (#4016) 👷 Update dependency vite to v5.4.21 [SECURITY] (#4015) 👷 Configure test apps dependencies (#4014) v6.25.0 (#4012) 👷 Update dependency vite to v5.4.21 [SECURITY] (#4010) 👷 Include test apps in renovate scan (#4009) 👷 restore canary deployment (#4008) ...
Bundles Sizes Evolution
🚀 CPU Performance
🧠 Memory Performance
|
|
✅ Tests 🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage 🔗 Commit SHA: cfee5f0 | Docs | Datadog PR Page | Was this helpful? Give us feedback! |
Motivation
The current DOM serialization algorithm serializes a node's children before assigning it an id. This results in a post-order assignment, like this:
This is a fine solution for the current data format; node ids are recorded explicitly, so we can assign them using whatever scheme we wish. However, the new format assigns node ids implicitly in the order that the nodes appear, which means that parent nodes will naturally be assigned a node id before their descendants. The new format uses a pre-order assignment, which looks like this:
This difference in node id numbering is the biggest source of complexity when comparing the output of the current DOM serialization algorithm with the output of the new one. It'd be ideal if both algorithms used the same ids; this would make translation between their representations straightforward.
The good news is that, because the current data format records node ids explicitly, there's no backwards compatibility issue with simply changing how node ids are assigned. So, we can switch the existing algorithm to use a pre-order assignment, and eliminate this impedance mismatch between the two algorithms.
Changes
This PR:
serializeNodeWithId()toSerializationTransaction#assignId().SerializationTransaction#assignId()to do the work.serializeNodeWithId(), which now does nothing, and replaces calls to it with direct calls toserializeNode().1to0, for better alignment between the old and new algorithms.Test instructions
You can test the changes using the "Live Replay" tab of the browser SDK extension.
Checklist