Skip to content

⚗ [RUMF-902] implement a new mutation observer #810

Merged
BenoitZugmeyer merged 15 commits intomasterfrom
benoit/reimplement-mutation-processing
Apr 30, 2021
Merged

⚗ [RUMF-902] implement a new mutation observer #810
BenoitZugmeyer merged 15 commits intomasterfrom
benoit/reimplement-mutation-processing

Conversation

@BenoitZugmeyer
Copy link
Member

@BenoitZugmeyer BenoitZugmeyer commented Apr 27, 2021

Motivation

The current mutation observer logic is complicated and inefficient. Let's implement a new mutation observer.

Changes

Testing

Manual, unit/e2e

Perf impact:

Before:

Screenshot 2021-04-28 at 14 36 39

SDK time: about 1900ms

After:

Screenshot 2021-04-28 at 14 21 54

SDK time: about 560ms


I have gone over the contributing documentation.

@BenoitZugmeyer BenoitZugmeyer requested a review from a team as a code owner April 27, 2021 17:50
@BenoitZugmeyer BenoitZugmeyer marked this pull request as draft April 27, 2021 17:50
@BenoitZugmeyer BenoitZugmeyer force-pushed the benoit/reimplement-mutation-processing branch from 159bf2e to 11779f9 Compare April 28, 2021 08:33
@codecov-commenter
Copy link

codecov-commenter commented Apr 28, 2021

Codecov Report

Merging #810 (0d860e0) into master (270cff1) will increase coverage by 0.57%.
The diff coverage is 91.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #810      +/-   ##
==========================================
+ Coverage   86.24%   86.81%   +0.57%     
==========================================
  Files          78       80       +2     
  Lines        3737     3869     +132     
  Branches      848      879      +31     
==========================================
+ Hits         3223     3359     +136     
+ Misses        514      510       -4     
Impacted Files Coverage Δ
packages/rum-recorder/src/boot/recorder.ts 100.00% <ø> (ø)
packages/rum-recorder/src/domain/rrweb/types.ts 100.00% <ø> (ø)
packages/rum-recorder/src/domain/rrweb/utils.ts 82.97% <ø> (-1.34%) ⬇️
packages/rum-recorder/src/domain/rrweb/observer.ts 76.86% <48.14%> (-1.77%) ⬇️
packages/rum-recorder/test/utils.ts 83.87% <85.71%> (+4.40%) ⬆️
.../rum-recorder/src/domain/rrweb/mutationObserver.ts 99.09% <99.09%> (ø)
...er/src/domain/rrweb-snapshot/serializationUtils.ts 100.00% <100.00%> (ø)
...rum-recorder/src/domain/rrweb-snapshot/snapshot.ts 82.37% <100.00%> (+0.81%) ⬆️
packages/rum-recorder/src/domain/rrweb/mutation.ts 65.15% <100.00%> (+0.47%) ⬆️
packages/rum-recorder/src/domain/rrweb/record.ts 64.00% <100.00%> (ø)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 270cff1...0d860e0. Read the comment docs.

Base automatically changed from benoit/prepare-mutation-tests to master April 28, 2021 09:04
This commit introduces a few utility functions to store and retrieve
serialized node ids stored in DOM nodes. The equivalent tools (`mirror`,
`INode`...) that are only used by the 'mutation' module are intentionally
left as-is, and will be removed in a future PR when we'll enable the new
mutation observer.

Before this commit, some 'observer' strategies didn't check if
`mirror.getId` returned a valid `id` before emitting a record. In
practice it shouldn't happen, but in unit tests it did happen because we
artificially emitted 'input' events on a DOM node that wasn't in the
document. This is why some tests were slightly adjusted to reflect the
reality more accurately.
@BenoitZugmeyer BenoitZugmeyer force-pushed the benoit/reimplement-mutation-processing branch from 11779f9 to 3863583 Compare April 28, 2021 09:05
@BenoitZugmeyer BenoitZugmeyer marked this pull request as ready for review April 28, 2021 09:09
@BenoitZugmeyer BenoitZugmeyer force-pushed the benoit/reimplement-mutation-processing branch 2 times, most recently from 83cdea4 to 2630513 Compare April 28, 2021 18:36
@BenoitZugmeyer BenoitZugmeyer force-pushed the benoit/reimplement-mutation-processing branch from 2630513 to 6a232ed Compare April 28, 2021 18:38
We chose to remove this test because its intent is a bit obscure and the
new mutation observer has a good enough test coverage. I kept the test
for the old mutation observer implementation since its coverage is still
lacking. It will be removed when we enable the new mutation observer.
@BenoitZugmeyer BenoitZugmeyer force-pushed the benoit/reimplement-mutation-processing branch from 1404e07 to 581bc9d Compare April 29, 2021 14:57
Copy link
Collaborator

@bcaudan bcaudan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work 🎉

Co-authored-by: Bastien Caudan <bastien.caudan@datadoghq.com>
Copy link
Contributor

@vlad-mh vlad-mh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, super exciting perf wins! let's test it out
some nits, but mostly ideas for further improvements

Comment on lines +58 to +61
document.contains(mutation.target) &&
hasSerializedNode(mutation.target) &&
!nodeOrAncestorsIsIgnored(mutation.target) &&
!nodeOrAncestorsShouldBeHidden(mutation.target)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some optimization thoughts, probably for a later PR:

  • both nodeOrAncestorsIsIgnored and nodeOrAncestorsShouldBeHidden walk the parent hierarchy of mutation.target. Ultimately, we should do it only once.
  • document.contains by definition must be O(N), N being number of dom elements? If we walk the parent hierarchy of mutation.target anyway, we might as well just check the oldest parent is document?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree, this is still something I have in mind: optimizing this loop or using the 'cache' optimization could still be usefull here. In practice, I observed that this whole filter loop takes less than 1ms even when hundreds of mutations occurs.

document.contains is fast. It is not O(N), and the browser probably uses internal tricks to avoid walking on all parents. See for yourself, run this in a busy page:

{ const list = document.querySelectorAll('*'); const start = performance.now(); list.forEach(e => { while (e) { e = e.parentNode };  }); console.log(performance.now() - start) }

{ const list = document.querySelectorAll('*'); const start = performance.now(); list.forEach(e => document.contains(e)); console.log(performance.now() - start) }

if (!addedAndMovedNodes.has(node)) {
removedNodes.set(node, mutation.target)
}
addedAndMovedNodes.delete(node)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm nitpicking at this point, so feel free to ignore but if you processed mutation.removedNodes prior to mutation.addedNodes I think you would, in your second loop, only end up doing one lookup in the set rather than two?

in my mind:

    forEach(mutation.removedNodes, (node) => {
        removedNodes.set(node, mutation.target)
    }
    forEach(mutation.addedNodes, (node) => {
       if (!removedNodes.has(node)) {
        addedAndMovedNodes.set(node, mutation.target)
      }    
    })

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A node cannot be both removed and added in a single mutation. Then, the process order doen't matter.


// Deduplicate mutations based on their target node
const handledNodes = new Set<Node>()
const filteredMutations = mutations.filter((mutation) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we live with just 1 loop here? 🤔

const textMutations: TextMutation[] = []
const handledNodes = new Set<Node>()

for (n in mutations) {
   if (handledNodes.has(mutation.target)) {
       continue;
    }
    handledNodes.add(mutation.target)
    const value = mutation.target.textContent
    if (value === mutation.oldValue) {
        continue
    }
    textMutations.push({
        id: getSerializedNodeId(mutation.target),
        value,
    })

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my first version for both processCharacterDataMutations and processAttributesMutations. But, for attributes, the loop got a bit hairy... so I split the loop into "deduplication" and "emission". To have a consistent pattern between the two functions, I applied the same structure to character data mutations. I find it a bit easier to read, and it shouldn't matter performance wise.

@BenoitZugmeyer BenoitZugmeyer merged commit bb2e83e into master Apr 30, 2021
@BenoitZugmeyer BenoitZugmeyer deleted the benoit/reimplement-mutation-processing branch April 30, 2021 09:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants