Performance idea: add ascii skip char counting happy path. Worth the complexity?#118
Merged
ef4 merged 2 commits intoembroider-build:mainfrom Mar 17, 2026
Merged
Conversation
4450f2f to
c71c45b
Compare
Collaborator
|
Thanks, this seems good to me. Not much extra complexity (and I'm going to land a followup that introduces a real constructor for LocateContentTagVisitor which will make it even clearer). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hey! I was profiling
parse()and noticed that for ASCII-only source files (which most.gtsfiles are), theRangecomputation does.chars().count()and.encode_utf16().count()scans even though byte/char/utf16 offsets are guaranteed to be identical for ASCII text.This PR adds a simple
src.is_ascii()check before the AST walk and skips those scans when true. Not sure if the added branch is worth the complexity, but the numbers seem meaningful — especially ifparse()is called on a hot path (e.g. per-keystroke in a language server).The change is small: one
boolfield on the visitor, one branch inRange::new. Non-ASCII files fall through to the existing logic unchanged.Benchmark (native, release profile, Apple Silicon)
All three tables use the same 208-char base component as their first row, varying one dimension at a time.
By template count (same component repeated N times)
By template content size (same component, extra rows inside the template)
By JS code before template (same component, extra JS lines prepended)
The multi-template case shows the biggest win (3.6x at 20 templates) because the old code re-scans from byte 0 for every range offset. For single-template files, the improvement is a steady ~15-20% regardless of file size. Most of the absolute time is SWC parsing, which is unaffected by this change.
Non-ASCII files are unaffected by this change.
Test plan
benches/parse_bench.rsfor reproducibilityCowritten by claude