
perf: add fast path for strings without ANSI codes #54

Merged
sindresorhus merged 3 commits into chalk:main from privatenumber:perf/state-machine
Feb 26, 2026

@privatenumber
Contributor

Problem

stripAnsi always runs a regex replacement, even when the input has no ANSI escape codes. This is the common case — most strings passed through strip-ansi are already plain text (e.g. checking/sanitizing user input, processing log lines that are mostly text).

Changes

Adds a string.includes('\x1B') guard before the regex. Since all ANSI escape sequences start with ESC (0x1B), its absence means no ANSI codes exist and we can return the string immediately, skipping the regex replacement entirely.
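A minimal sketch of the guard described above. It is not the library's actual source: `stripAnsiSketch` and `csiPattern` are hypothetical names, and the pattern is a simplified stand-in that only covers ESC-introduced CSI sequences such as SGR color codes. A pattern that also matched 8-bit introducers (for example U+009B) would need a correspondingly wider guard.

```javascript
// Assumption: simplified CSI-only pattern, standing in for the real ANSI regex.
const csiPattern = /\u001B\[[0-9;]*[A-Za-z]/g;

function stripAnsiSketch(string) {
  if (typeof string !== 'string') {
    throw new TypeError(`Expected a \`string\`, got \`${typeof string}\``);
  }

  // Fast path: without an ESC character this pattern cannot match,
  // so the input is returned unchanged and the regex never runs.
  if (!string.includes('\u001B')) {
    return string;
  }

  // Slow path: at least one ESC is present, so run the replacement.
  return string.replace(csiPattern, '');
}

console.log(stripAnsiSketch('Hello, World!')); // → Hello, World!
console.log(stripAnsiSketch('\u001B[31mHello\u001B[39m')); // → Hello
```

The fast path returns the same string value it was given, so callers that pass plain text pay only for one native scan.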

Benchmarks

cpu: Apple M2 Max, runtime: node 25.2.1 (arm64-darwin)

| Input | Before | After | Speedup |
| --- | --- | --- | --- |
| "Hello, World!" (no ANSI) | 25.49 ns | 1.90 ns | 13x |
| "a" × 1000 (no ANSI) | 453.09 ns | 9.73 ns | 46x |
| Short ANSI (`\x1B[31mHello\x1B[39m`) | 59.11 ns | 63.31 ns | ~1x |
| Medium ANSI (nested SGR) | 133.88 ns | 144.28 ns | ~1x |
| Heavy ANSI (100-char body) | 182.07 ns | 196.88 ns | ~1x |
| OSC hyperlink | 56.09 ns | 60.02 ns | ~1x |
| 100 ANSI segments | 3.52 µs | 5.10 µs | ~1x |

Non-ANSI input is 13–46x faster. For the short ANSI inputs, the includes check adds roughly 4–15 ns per call; the 100-segment case is the outlier, going from 3.52 µs to 5.10 µs.
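For readers who want to reproduce the shape of this comparison without extra dependencies, here is a rough timing loop. It is only a sketch: `nanosecondsPerCall`, `baseline`, and `guarded` are hypothetical names, the pattern is a simplified CSI-only stand-in, and a plain loop like this is far noisier than a dedicated benchmarking tool, so absolute numbers will not match the table.

```javascript
// Assumption: simplified CSI-only pattern, not the real ANSI regex.
const pattern = /\u001B\[[0-9;]*[A-Za-z]/g;

// Always runs the regex replacement.
const baseline = string => string.replace(pattern, '');
// Checks for ESC first and skips the regex when it is absent.
const guarded = string => (string.includes('\u001B') ? string.replace(pattern, '') : string);

function nanosecondsPerCall(fn, input, iterations = 200_000) {
  let sink = 0; // accumulate a value so the calls have an observable result
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    sink += fn(input).length;
  }
  const elapsedMs = performance.now() - start;
  return {ns: (elapsedMs * 1e6) / iterations, sink};
}

const inputs = {
  plain: 'Hello, World!',
  ansi: '\u001B[31mHello\u001B[39m',
};

for (const [name, input] of Object.entries(inputs)) {
  const before = nanosecondsPerCall(baseline, input);
  const after = nanosecondsPerCall(guarded, input);
  console.log(`${name}: ${before.ns.toFixed(2)} ns -> ${after.ns.toFixed(2)} ns`);
}
```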

@sindresorhus
Member

Looks good.

I want to see the outcome of nodejs/node#61833 before merging this though.

@privatenumber
Contributor Author

nodejs/node#61833 was merged

@Qix-
Contributor

Qix- commented Feb 26, 2026

Can you include some benchmarks of large input, though? Like several megabytes. There are cases where people are running this on large log output from e.g. PTYs that have captured ANSI escapes. I worry they'll be hit with a performance regression by the introduction of two linear-time scans, though given that includes is native code the cost could be negligible. I just want to make sure.

@privatenumber
Contributor Author

privatenumber commented Feb 26, 2026

Benchmarked up to 5 MB with three scenarios on Apple M2 Max / Node 25.2.1 using mitata:

Colored log output (ANSI throughout — includes finds ESC in the first byte):

| Size | Baseline | With guard |
| --- | --- | --- |
| 1 KB | 2.77 µs | 1.54 µs |
| 10 KB | 17.60 µs | 17.69 µs |
| 100 KB | 203.66 µs | 143.84 µs |
| 1 MB | 1.64 ms | 1.47 ms |
| 5 MB | 9.84 ms | 8.59 ms |

No regression. The includes check is O(1) in practice here since ESC appears in the first few bytes of colored output.

Plain log output (no ANSI — fast path):

| Size | Baseline | With guard | Speedup |
| --- | --- | --- | --- |
| 1 KB | 653.38 ns | 42.39 ns | 15x |
| 10 KB | 7.67 µs | 433.24 ns | 18x |
| 100 KB | 92.15 µs | 4.43 µs | 21x |
| 1 MB | 376.75 µs | 54.74 µs | 7x |
| 5 MB | 2.13 ms | 269.57 µs | 8x |

Worst case: ANSI only at end (includes scans full string before finding ESC, then regex scans again):

| Size | Baseline | With guard |
| --- | --- | --- |
| 1 KB | 685.01 ns | 573.70 ns |
| 10 KB | 5.49 µs | 6.57 µs |
| 100 KB | 84.32 µs | 56.26 µs |
| 1 MB | 592.69 µs | 507.05 µs |
| 5 MB | 3.59 ms | 2.49 ms |

Even in the pathological worst case (5 MB of plain text with a single ANSI code at the very end), the double scan shows no consistent regression: only the 10 KB row is slower with the guard (6.57 µs vs 5.49 µs), and the other sizes are as fast or faster. The includes call is a native linear scan whose cost is small relative to the regex replacement.

@Qix-
Contributor

Qix- commented Feb 26, 2026

LGTM then. :) Thank you for doing that.

@sindresorhus sindresorhus merged commit d67a5b3 into chalk:main Feb 26, 2026
2 checks passed