
Document BFS dominator tree approximation bug and add regression test#2800

Open
pyricau wants to merge 15 commits into main from worktree-treemap-heapdump

Conversation

pyricau (Member) commented Feb 26, 2026

Summary

Fixes #2715 — retained sizes reported by LeakCanary could be significantly inflated by being attributed to the wrong ancestor, due to a known approximation bug in the incremental BFS+LCA dominator tree algorithm.

Root cause

DominatorTree builds dominators incrementally during BFS using a Lowest Common Ancestor (LCA) approach. When a cross-edge (an edge to an already-visited node) is processed, the parent's dominator may still be stale: it may later be raised by another cross-edge at the same BFS level. By the time the parent's dominator is corrected, the child's dominator has already been computed and is never revisited, leaving it too specific (too deep in the tree). The consequence is that the child's retained size is attributed to an ancestor that doesn't actually exclusively dominate it.
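The failing scenario can be sketched with a toy model (illustrative, self-contained code — the node names, the `lca` helper, and `staleDominators` are hypothetical, not shark's API). The graph is root→x, root→y, x→p, x→c, y→r, plus cross-edges p→c and r→p. Processing p→c while dom(p) is still x leaves dom(c) = x, even though the path root→y→r→p→c bypasses x, so the correct immediate dominator of c is the root:

```kotlin
// Toy model of the BFS+LCA approximation (illustrative, not shark's DominatorTree).
// Node 0 is the virtual root; dom[n] holds the current immediate-dominator guess.
const val ROOT = 0

// Walks b's dominator chain until it meets an ancestor of a (the root always matches).
fun lca(dom: IntArray, a: Int, b: Int): Int {
  val ancestorsOfA = generateSequence(a) { if (it == ROOT) null else dom[it] }.toSet()
  var node = b
  while (node !in ancestorsOfA) node = dom[node]
  return node
}

// Replays the BFS edge order on the minimal failing graph and returns dom[].
fun staleDominators(): IntArray {
  val x = 1; val y = 2; val p = 3; val c = 4; val r = 5
  val dom = IntArray(6)
  dom[x] = ROOT; dom[y] = ROOT        // level 1 tree edges
  dom[p] = x; dom[c] = x; dom[r] = y  // level 2 tree edges
  dom[c] = lca(dom, dom[c], p)        // cross-edge p→c: dom[p] is still x, so dom[c] stays x
  dom[p] = lca(dom, dom[p], r)        // cross-edge r→p raises dom[p] to ROOT...
  return dom                          // ...but dom[c] is never revisited: stays x, too specific
}
```

Recomputing `lca(dom, dom[c], p)` after dom(p) has been raised yields ROOT, the correct immediate dominator for c.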

Fix

  • Adds a convergence loop to DominatorTree: stores cross-edges during BFS, then re-processes them with updated dominator values after the BFS completes, iterating until the tree stabilizes (typically 2–3 passes)
  • Cross-edge storage is gated behind collectCrossEdges = true (opt-in, no overhead for callers that don't call runConvergenceLoop)
  • Filters at insertion time skip cross-edges where either endpoint is already attributed to the virtual root (NULL_REFERENCE), since re-processing them in the convergence loop would always be a no-op
  • Settled edges are pruned at the start of each convergence pass so later passes iterate fewer entries
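The convergence idea can be sketched as follows (standalone illustrative code with assumed names; this is a simplified model, not shark's exact `runConvergenceLoop` signature). Recorded (node, parent) cross-edges are replayed with current dominator values until a full pass changes nothing:

```kotlin
// Sketch of the convergence loop (illustrative, standalone; not shark's exact API).
const val NULL_REF = 0  // virtual-root sentinel, like shark's NULL_REFERENCE

fun lcaOf(dom: IntArray, a: Int, b: Int): Int {
  val ancestors = generateSequence(a) { if (it == NULL_REF) null else dom[it] }.toSet()
  var n = b
  while (n !in ancestors) n = dom[n]
  return n
}

// Replays recorded (node, parent) cross-edges until a pass changes nothing.
// Returns the number of passes run.
fun runConvergenceLoop(
  dom: IntArray,
  crossEdges: List<Pair<Int, Int>>,
  maxIterations: Int
): Int {
  repeat(maxIterations) { pass ->
    var changed = false
    for ((node, parent) in crossEdges) {
      if (dom[node] == NULL_REF) continue  // settled: attributed to the virtual root
      val updated = lcaOf(dom, dom[node], parent)
      if (updated != dom[node]) {
        dom[node] = updated
        changed = true
      }
    }
    if (!changed) return pass + 1  // stabilized
  }
  return maxIterations
}
```

With a single stale entry the loop settles in two passes: one to apply the correction, one to confirm nothing else moves.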

Data structure improvement

Cross-edges were previously stored as MutableList<LongArray> (~40 bytes/entry with object header overhead and indirection). Replaced with LongPairList: a flat LongArray where each pair occupies two consecutive slots (~16 bytes/entry, ~2.5× memory reduction, better cache locality). Pruned entries are marked with a 0L sentinel rather than shrinking the array.
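A minimal sketch of such a flat pair list (simplified; shark's actual LongPairList may differ in details):

```kotlin
// Flat pair list: entry i occupies slots [i*2, i*2+1] of one LongArray.
// Cleared entries are marked with 0L in the first slot; the array never shrinks.
class LongPairList(initialCapacity: Int = 4) {
  private var data = LongArray(initialCapacity * 2)
  var size = 0
    private set

  fun add(first: Long, second: Long) {
    val base = size * 2
    if (base == data.size) data = data.copyOf(data.size * 2)  // double when full
    data[base] = first
    data[base + 1] = second
    size++
  }

  fun firstAt(index: Int) = data[index * 2]
  fun secondAt(index: Int) = data[index * 2 + 1]

  // Marks entry [index] as cleared; callers must never store 0L as a valid first value.
  fun clearAt(index: Int) {
    data[index * 2] = 0L
  }

  // Visits live (non-cleared) entries; inline, so the caller's lambda can
  // mutate captured local vars without boxing.
  inline fun forEachIndexed(block: (index: Int, first: Long, second: Long) -> Unit) {
    for (i in 0 until size) {
      val first = firstAt(i)
      if (first != 0L) block(i, first, secondAt(i))
    }
  }
}
```

Hoisting `size * 2` into the local `base` avoids recomputing it, matching the review feedback on this PR.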

Production wiring

Enabled the convergence loop in all three production callers:

  • PrioritizingShortestPathFinder, which covers two of them: LeakCanary leak tracing and ObjectDominators
  • ObjectGrowthDetector

Tests

  • known bug - BFS ordering leaves child dominator too specific: asserts the current (incorrect) behavior before the fix, documenting the approximation
  • convergence loop fixes stale dominator attribution: asserts the same graph produces the correct result after runConvergenceLoop()
  • convergence loop stops at maxIterations: verifies the loop respects the iteration cap

Test plan

  • ./gradlew :shark:shark:test passes
  • ./gradlew :shark:shark:apiCheck passes

🤖 Generated with Claude Code

pyricau and others added 13 commits February 26, 2026 12:30
The incremental LCA-during-BFS approach produces incorrect immediate
dominators when same-level cross-edges cause a node's dominator to be
raised after its children have already been discovered. Those children
keep stale dominator pointers that are too specific (too deep in the
tree), so retained sizes get attributed to the wrong ancestor.

Add class- and method-level KDoc naming the limitation concretely, with
an ASCII graph of the minimal failing case. Add a passing test that
asserts the current (wrong) behavior so the bug is visible and any
future fix is automatically validated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The BFS+LCA incremental algorithm can leave a node's immediate dominator
too specific (too far from the virtual root) when a cross-edge P→C is
processed while dom(P) is still stale, and dom(P) is later raised by
another same-level cross-edge after C's LCA was already settled.

This introduces:
- `collectCrossEdges = true` constructor flag on DominatorTree, which
  records each cross-edge (already-visited target) during BFS
- `runConvergenceLoop(maxIterations)` that re-runs LCA on stored
  cross-edges until dominated[] stabilizes (typically 2–3 passes)

Documentation and tests are also updated:
- Fixes the existing test whose graph analysis was incorrect (the old
  example had dom(C)=root as the *correct* answer, not a bug); replaces
  it with a graph where the bug genuinely manifests (dom left too specific)
- Adds a test asserting that runConvergenceLoop fixes the attribution
- Adds a test verifying maxIterations=0 leaves the stale result intact

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rename N to ObjectId for clarity
- Replace asRoot() with forestRoot > node, unifying all edges
  under the same > operator (forestRoot wraps NULL_REFERENCE)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three optimizations that eliminate redundant cross-edges:

1. Don't record a cross-edge if dom(objectId) is already NULL_REFERENCE at
   insertion time — the convergence loop would always skip it.

2. Don't record a cross-edge if dom(parentObjectId) is NULL_REFERENCE at
   insertion time — the LCA walk terminates in one step and produces the same
   result already written by the updateDominated call, so re-processing in the
   loop is a no-op.

3. After each convergence pass that produced changes, prune edges whose
   object has reached NULL_REFERENCE. This shrinks the list for subsequent
   passes, which typically converge in 2–3 iterations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
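The two insertion-time filters above can be sketched as a single predicate (hypothetical helper with assumed names; dominator storage is simplified to a Map here, whereas shark uses primitive long-keyed storage):

```kotlin
// Illustrative insertion-time filter for cross-edges (not shark's internals).
const val NULL_REFERENCE = 0L

fun shouldRecordCrossEdge(
  dominated: Map<Long, Long>,
  objectId: Long,
  parentObjectId: Long
): Boolean {
  // (1) Already attributed to the virtual root: the convergence loop always skips it.
  if (dominated[objectId] == NULL_REFERENCE) return false
  // (2) Parent already at the virtual root: the LCA walk terminates in one step and
  //     reproduces the value the BFS update just wrote, so replaying it is a no-op.
  if (dominated[parentObjectId] == NULL_REFERENCE) return false
  return true
}
```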
Cross-edges are recorded before the LCA result is stored in updateDominated,
so the LCA can set dom(objectId)=NULL_REFERENCE after the edge is already in
the list. These edges are inert (the convergence loop would always skip them)
but weren't cleaned up until a later pass with changed=true.

Add an initial pruneSettled() call at the top of runConvergenceLoop to handle
this, and factor out the shared removeAll predicate to avoid duplication.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The existing NULL_REFERENCE guard in the loop already skips settled edges in
O(1), so eagerly reclaiming that memory adds complexity with no practical
benefit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pruning before each pass rather than after means:
- Edges already settled after the BFS traversal (when updateDominated's own
  LCA set dom(objectId)=NULL_REFERENCE after the edge was recorded) are
  cleaned up before the very first pass.
- Edges settled during pass N are removed before pass N+1, reducing the
  iteration count for all subsequent passes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MutableList<LongArray> stores each edge as a separate heap object (~32 bytes
header + data + pointer), with poor cache locality. Replace it with a single
flat LongArray where consecutive pairs (objectId, parentObjectId) occupy
indices [i*2, i*2+1].

CrossEdgeBuffer:
- add(): appends a pair; doubles the array when full
- prune(): marks settled entries in-place using NULL_REFERENCE (0L) as a
  sentinel — safe since heap object IDs are always > 0; array never shrinks
- forEach(): inline higher-order function that skips marked entries, allowing
  the convergence loop body to modify captured vars (e.g. `changed`) without
  boxing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Move the flat long-pair buffer to its own file as LongPairList, with no
  knowledge of dominators or NULL_REFERENCE semantics
- Replace prune(dominated) with clearAt(index) + forEachIndexed, keeping
  the data structure generic (0L as cleared-entry sentinel is documented
  as a caller constraint)
- Move pruning logic into DominatorTree.pruneSettledCrossEdges(), which owns
  the dominator-specific slot checks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes #2715

The convergence loop added to DominatorTree was not wired up anywhere,
so retained sizes computed by LeakCanary, ObjectDominators, and
ObjectGrowthDetector were still subject to the stale-dominator bug where
cross-edges processed with an out-of-date dom(parent) leave child
dominators too specific, causing over-attribution of retained sizes.

Enable collectCrossEdges = true in PrioritizingShortestPathFinder and
ObjectGrowthDetector, and call runConvergenceLoop() before
computeRetainedSizes() / buildFullDominatorTree() in all three callers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pyricau (Member, Author) commented on this excerpt from LongPairList.add():

```kotlin
data = data.copyOf(data.size * 2)
}
data[size * 2] = first
data[size * 2 + 1] = second
```

Move size * 2 to a local var to avoid doing it 3 times

pyricau and others added 2 commits February 26, 2026 21:52
Benchmarks on gcroot_unknown_object.hprof (25 MB) revealed that the
convergence loop is not viable for production use:
  - 107,692 cross-edges stored after BFS
  - 781 iterations to converge (vs the assumed "2-3")
  - 62 s loop time on top of a 1.5 s analysis (~40x overhead)

The O(cross-edges × depth × iterations) complexity means correction
chains propagate one hop per iteration, so a heap with deep object
graphs requires as many iterations as the longest stale-dominator chain.

Update class-level and @param KDoc in DominatorTree to document the
performance warning. Add ConvergenceLoopBenchmark to measure the
overhead on a real heap dump (without loop vs with loop, 4 timed runs).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
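The one-hop-per-iteration behavior can be demonstrated on the earlier toy graph (illustrative standalone code, not shark's implementation). When the edge that depends on dom(p) is replayed before the edge that raises dom(p), the correction needs an extra pass to reach c; chain k such links and convergence needs on the order of k passes, consistent with the hundreds of iterations observed on a real heap:

```kotlin
// Demonstrates one-hop-per-pass propagation (illustrative; node 0 is the root).
fun chainLca(dom: IntArray, a: Int, b: Int): Int {
  val ancestors = generateSequence(a) { if (it == 0) null else dom[it] }.toSet()
  var n = b
  while (n !in ancestors) n = dom[n]
  return n
}

// Runs convergence passes over crossEdges in the given order; returns passes until stable.
fun passesToConverge(crossEdges: List<Pair<Int, Int>>): Int {
  // Stale state after BFS on: root→x, root→y, x→p, x→c, y→r (x=1, y=2, p=3, c=4, r=5).
  val dom = intArrayOf(0, 0, 0, 1, 1, 2)
  var count = 0
  while (true) {
    count++
    var changed = false
    for ((node, parent) in crossEdges) {
      val updated = chainLca(dom, dom[node], parent)
      if (updated != dom[node]) {
        dom[node] = updated
        changed = true
      }
    }
    if (!changed) return count
  }
}
```

Replaying (c←p) before (p←r) takes 3 passes; the reverse order takes 2, because p's correction reaches c within the same pass.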
Successfully merging this pull request may close these issues.

Retained size computation is incorrect
