perf: use Uint8Array bitmaps for faster tag ID lookups #1679

Open
TrevorBurnham wants to merge 1 commit into inikulin:master from TrevorBurnham:bitmap-optimizations

@TrevorBurnham

This PR replaces Set.has() lookups with Uint8Array bitmap lookups for tag ID membership checks in hot paths. In the micro-benchmark below, bitmap array indexing is ~3-4x faster than Set.has() for small integer keys, and TAG_ID values (0-120) are small, dense integers, which makes them ideal candidates for this optimization.
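
To make the idea concrete, here is a minimal sketch of the technique, not the code from this PR. The `Tag` enum, `TAG_COUNT`, and the `makeBitmap` helper are hypothetical stand-ins; parse5's real TAG_ID enum is much larger. Note that the "bitmap" here uses one byte per tag ID rather than one bit.

```typescript
// Hypothetical, tiny stand-in for parse5's TAG_ID enum.
enum Tag { UNKNOWN, APPLET, CAPTION, DIV, HTML, P, TABLE, TD }
const TAG_COUNT = 8; // one slot per possible tag ID

// Build a lookup table once: table[id] === 1 means "id is a member".
function makeBitmap(ids: readonly Tag[], size: number): Uint8Array {
    const bitmap = new Uint8Array(size); // zero-filled by default
    for (const id of ids) bitmap[id] = 1;
    return bitmap;
}

const SCOPING_IDS: readonly Tag[] = [Tag.APPLET, Tag.CAPTION, Tag.HTML, Tag.TABLE, Tag.TD];

// Before: hash-based membership test.
const SCOPING_SET = new Set<Tag>(SCOPING_IDS);
function isScopingViaSet(id: Tag): boolean {
    return SCOPING_SET.has(id);
}

// After: direct indexed read. An out-of-range index yields undefined,
// so comparing against 1 still returns false.
const SCOPING_BITMAP = makeBitmap(SCOPING_IDS, TAG_COUNT);
function isScopingViaBitmap(id: Tag): boolean {
    return SCOPING_BITMAP[id] === 1;
}
```

Both functions answer the same question; the only difference is how membership is stored and read.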

This PR also adds benchmarks for SVG parsing. The effect of this change on those benchmarks is minimal, but they should be useful for preventing future performance regressions and testing other potential optimizations.

Changes

packages/parse5/lib/parser/open-element-stack.ts

  • Replaced Set<$> with Uint8Array bitmaps for scope boundary checking:
    • SCOPING_ELEMENTS_HTML_BITMAP
    • SCOPING_ELEMENTS_HTML_LIST_BITMAP
    • SCOPING_ELEMENTS_HTML_BUTTON_BITMAP
    • SCOPING_ELEMENTS_MATHML_BITMAP
    • SCOPING_ELEMENTS_SVG_BITMAP
  • Updated hasInDynamicScope() to use bitmap lookups instead of Set.has()
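
As a rough illustration of what a bitmap-based scope check might look like, the sketch below walks a stack of open tag IDs from the top down. It is a simplified assumption, not the PR's implementation: the `Tag` enum, `hasInScope`, and the single `SCOPING_BITMAP` are hypothetical, and namespace handling (HTML vs. SVG vs. MathML) is left out entirely.

```typescript
// Hypothetical, tiny stand-in for parse5's TAG_ID enum.
enum Tag { UNKNOWN, BUTTON, DIV, HTML, P, TABLE, TD }
const TAG_COUNT = 7;

// Scope boundaries for this sketch only; the real sets are larger.
const SCOPING_BITMAP = new Uint8Array(TAG_COUNT);
for (const id of [Tag.HTML, Tag.TABLE, Tag.TD]) SCOPING_BITMAP[id] = 1;

// Returns true if `target` is found on the stack before any scope
// boundary is hit, searching from the most recently opened element.
function hasInScope(openTagIds: readonly Tag[], target: Tag, scope: Uint8Array): boolean {
    for (let i = openTagIds.length - 1; i >= 0; i--) {
        const id = openTagIds[i]!;
        if (id === target) return true;
        // Indexed read replaces what would otherwise be scope.has(id).
        if (scope[id] === 1) return false;
    }
    return false;
}

// <html><div><p>: the <p> is in scope.
const inScope = hasInScope([Tag.HTML, Tag.DIV, Tag.P], Tag.P, SCOPING_BITMAP);
// <html><p><table><td>: the <td> boundary hides the <p>.
const hidden = hasInScope([Tag.HTML, Tag.P, Tag.TABLE, Tag.TD], Tag.P, SCOPING_BITMAP);
```

Passing the bitmap as a parameter lets one loop serve several scope variants (default, list item, button), each with its own table.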

packages/parse5/lib/common/foreign-content.ts

  • Replaced EXITS_FOREIGN_CONTENT Set with EXITS_FOREIGN_CONTENT_BITMAP Uint8Array
  • Updated causesExit() to use bitmap lookup

bench/perf/svg-benchmark.js (new)

  • Added comprehensive SVG parsing benchmark to test foreign content handling performance
  • Tests various SVG patterns: simple shapes, paths, nested groups, attribute-heavy elements
  • Compares SVG parsing performance against equivalent HTML complexity

Micro-benchmark Results

Set.has vs Bitmap lookup for tag ID membership:
  Set.has:   45.45ms
  Bitmap[]:  13.25ms
  Winner: Bitmap[] (3.43x faster)
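
For readers who want to try a comparison like this themselves, here is a minimal sketch of such a micro-benchmark. It is not the benchmark used for the numbers above: the member IDs, iteration count, and helper names are assumptions, and timings will vary by engine, hardware, and warm-up.

```typescript
const TAG_COUNT = 121; // TAG_ID values span 0-120 per the description
const MEMBERS = [3, 17, 42, 64, 99, 120]; // arbitrary, hypothetical member IDs
const ITERATIONS = TAG_COUNT * 10_000;

const set = new Set<number>(MEMBERS);
const bitmap = new Uint8Array(TAG_COUNT);
for (const id of MEMBERS) bitmap[id] = 1;

// Count hits so the engine cannot discard the lookups as dead code.
function countHits(isMember: (id: number) => boolean, iterations: number): number {
    let hits = 0;
    for (let i = 0; i < iterations; i++) {
        if (isMember(i % TAG_COUNT)) hits++;
    }
    return hits;
}

// Time one lookup strategy; Date.now() is coarse but available everywhere.
function timeLookups(label: string, isMember: (id: number) => boolean): number {
    const start = Date.now();
    const hits = countHits(isMember, ITERATIONS);
    console.log(label + ": " + (Date.now() - start) + "ms (" + hits + " hits)");
    return hits;
}

const setHits = timeLookups("Set.has", (id) => set.has(id));
const bitmapHits = timeLookups("Bitmap[]", (id) => bitmap[id] === 1);
```

Routing both strategies through the same callback keeps the comparison symmetric, though it may also hide some of the difference a hand-inlined loop would show.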

End-to-End Benchmark Results

The bitmap optimization shows neutral to slightly positive results in end-to-end benchmarks (within margin of error).

Why This Approach

  1. No per-lookup setup cost: Bitmaps are created once at module load time
  2. Cache-friendly: Uint8Array is compact and fits in L1 cache
  3. Simple indexing: bitmap[tagId] vs set.has(tagId) - direct array access vs hash lookup
  4. No API changes: Internal optimization only, no breaking changes

Replace Set.has() with Uint8Array bitmap lookups for scope checking
and foreign content exit detection. Bitmap lookup is ~3-4x faster
than Set.has() for small integer keys like TAG_IDs.

Changes:
- open-element-stack.ts: Use bitmaps for scope boundary checking
- foreign-content.ts: Use bitmap for EXITS_FOREIGN_CONTENT check

Also adds SVG-specific benchmark for testing foreign content parsing
performance.
@43081j (Collaborator) left a comment

makes sense to me! 👍

will wait for @fb55 to review too
