Cross-Product Literal Expansion for Char Classes
Patterns with small char classes in the middle of concatenations (e.g., ag[act]gtaaa|tttac[agt]ct) were routed to UseDFA (pure lazy DFA scan) because the char class broke literal extraction — 120x slower than Rust, 1.3x slower than Go stdlib.
The literal extractor now computes the cross-product through small char classes (≤10 chars), producing full-length discriminating literals for the Teddy SIMD prefilter.
Results (6MB DNA input, AMD EPYC)
| Pattern | Before (v0.12.2) | After (v0.12.3) | Speedup |
|---|---|---|---|
ag[act]gtaaa|tttac[agt]ct |
463ms (UseDFA) | 32ms (UseTeddy) | 14x |
agg[act]taaa|ttta[agt]cct |
455ms (UseDFA) | 33ms (UseTeddy) | 14x |
aggg[acg]aaa|ttt[cgt]ccct |
457ms (UseDFA) | 32ms (UseTeddy) | 14x |
agggt[cgt]aa|tt[acg]accct |
466ms (UseDFA) | 32ms (UseTeddy) | 14x |
All 9 regexdna patterns now use UseTeddy — 10-20x faster than stdlib across the board.
Safety
- Three-layer overflow protection (CrossProductLimit=250, MaxLiterals=64, truncate-to-4-bytes)
- FoldCase guard prevents case-insensitive matching bugs
- DNA correctness: byte-for-byte match parity with Go stdlib across 1KB/64KB/1MB
- Zero regressions on all existing benchmarks
Reported by @kostya via regexdna benchmark.