Commit 5ed05f7
perf: optimize strpos by eliminating double iteration for UTF-8

For non-ASCII strings, the original implementation used string.find()
to get the byte index, then counted characters up to that byte index.
This required two passes through the string.

This optimization uses char_indices() to find the substring while
simultaneously tracking character positions, completing the search
in a single pass.
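
The two approaches described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from this commit: the function names `strpos_two_pass` and `strpos_single_pass` are hypothetical, and the 1-based result with 0 for "not found" is an assumption borrowed from the usual SQL `strpos` convention.

```rust
/// Two-pass version (hypothetical name): `find` locates the byte offset,
/// then the prefix is re-scanned to convert bytes to characters.
/// Returns a 1-based character position, or 0 if `needle` is absent
/// (assumed convention, not confirmed by the commit).
fn strpos_two_pass(haystack: &str, needle: &str) -> usize {
    match haystack.find(needle) {
        // Second pass: count the characters before the match.
        Some(byte_idx) => haystack[..byte_idx].chars().count() + 1,
        None => 0,
    }
}

/// Single-pass version (hypothetical name): walk `char_indices()` once,
/// testing for a match at each character boundary while the enumerate
/// counter tracks the character position.
fn strpos_single_pass(haystack: &str, needle: &str) -> usize {
    // ASCII fast path: byte offsets and character offsets coincide.
    if haystack.is_ascii() {
        return haystack.find(needle).map_or(0, |byte_idx| byte_idx + 1);
    }

    let hay = haystack.as_bytes();
    let pat = needle.as_bytes();
    for (char_pos, (byte_idx, _)) in haystack.char_indices().enumerate() {
        // `byte_idx` is always a character boundary, so a byte-wise
        // prefix comparison is equivalent to a string comparison here.
        if hay[byte_idx..].starts_with(pat) {
            return char_pos + 1;
        }
    }
    0
}
```

One trade-off worth noting: this sketch compares the needle at every character boundary, so its worst case is proportional to haystack length times needle length, whereas `str::find` uses a more sophisticated search. Whether the actual commit handles this differently cannot be told from the message alone.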
Benchmark results (UTF-8 strings):
- str_len_8: 188.98 µs → 140.54 µs (25.4% faster)
- str_len_32: 615.69 µs → 294.15 µs (52.2% faster)
- str_len_128: 2.2707 ms → 1.2462 ms (45.1% faster)
- str_len_4096: 74.328 ms → 36.538 ms (50.9% faster)
ASCII performance unchanged (already optimized with fast path).

1 parent: 7c50448
1 file changed, 26 additions and 8 deletions: original lines 218-225 replaced by new lines 218-243.