Commit 636d57b
[ruby/strscan] Micro optimize encoding checks (ruby/strscan#117)
Profiling shows a lot of time is spent in various encoding check functions.
I'm working on optimizing them on the Ruby side, but if we assume most
strings use one of the three simple encodings, we can skip a lot of
that overhead.
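To make the idea concrete, here is a minimal Ruby sketch of a fast path for an encoding check. It is an illustration only, not the code from this commit (the actual change is in the C extension). The constant `COMMON_ENCODINGS`, the method `ascii_compatible_fast?`, and the choice of UTF-8, US-ASCII and binary as the three encodings are assumptions made for the example.

```ruby
# Hypothetical illustration: names and the choice of encodings are assumptions,
# not taken from the strscan source.
COMMON_ENCODINGS = [Encoding::UTF_8, Encoding::US_ASCII, Encoding::BINARY].freeze

# Returns true when the string's encoding is ASCII compatible.
# Fast path: an identity comparison against a few very common encodings.
# Slow path: the general query, used for every other encoding.
def ascii_compatible_fast?(string)
  enc = string.encoding
  return true if COMMON_ENCODINGS.any? { |common| common.equal?(enc) }

  enc.ascii_compatible?
end

puts ascii_compatible_fast?("1234,5678")                      # fast path (UTF-8)
puts ascii_compatible_fast?("1234".encode(Encoding::UTF_16LE)) # slow path
```

In pure Ruby this shape saves nothing; it only shows the structure. The gain in a C extension comes from replacing a general-purpose check with a few cheap comparisons in the common case.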
```ruby
require 'strscan'
require 'benchmark/ips'

source = 10_000.times.map { rand(9999999).to_s }.join(",").force_encoding(Encoding::UTF_8).freeze

def scan_to_i(source)
  scanner = StringScanner.new(source)
  while number = scanner.scan(/\d+/)
    number.to_i
    scanner.skip(",")
  end
end

def scan_integer(source)
  scanner = StringScanner.new(source)
  while scanner.scan_integer
    scanner.skip(",")
  end
end

Benchmark.ips do |x|
  x.report("scan.to_i") { scan_to_i(source) }
  x.report("scan_integer") { scan_integer(source) }
  x.compare!
end
```
Before:
```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
           scan.to_i    93.000 i/100ms
        scan_integer   232.000 i/100ms
Calculating -------------------------------------
           scan.to_i    933.191 (± 0.2%) i/s    (1.07 ms/i) -      4.743k in   5.082597s
        scan_integer      2.326k (± 0.8%) i/s  (429.99 μs/i) -     11.832k in   5.087974s

Comparison:
        scan_integer:     2325.6 i/s
           scan.to_i:      933.2 i/s - 2.49x  slower
```
After:
```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
           scan.to_i    96.000 i/100ms
        scan_integer   274.000 i/100ms
Calculating -------------------------------------
           scan.to_i    969.489 (± 0.2%) i/s    (1.03 ms/i) -      4.896k in   5.050114s
        scan_integer      2.756k (± 0.1%) i/s  (362.88 μs/i) -     13.974k in   5.070837s

Comparison:
        scan_integer:     2755.8 i/s
           scan.to_i:      969.5 i/s - 2.84x  slower
```
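The Before and After figures above can be turned into a relative gain per benchmark. The short script below is only a convenience for reading the numbers; the `RESULTS` hash and `percent_gain` helper are names introduced here, and the values are copied from the two "Comparison" sections.

```ruby
# Iterations per second, copied from the "Comparison" sections above.
RESULTS = {
  "scan.to_i"    => { before: 933.2,  after: 969.5 },
  "scan_integer" => { before: 2325.6, after: 2755.8 },
}.freeze

# Relative throughput gain in percent, rounded to one decimal place.
def percent_gain(before, after)
  ((after / before - 1.0) * 100).round(1)
end

RESULTS.each do |name, rates|
  gain = percent_gain(rates[:before], rates[:after])
  puts format("%-13s %+.1f%%", name, gain)
end
```

By this measure `scan_integer` gains roughly 18.5% and `scan.to_i` roughly 3.9%, which is why the gap between the two widens from 2.49x to 2.84x.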
ruby/strscan@c02b1ce6841
parent 79cc3d2, commit 636d57b
1 file changed, +38 -3 lines changed