byroot commented Nov 1, 2025

Closes: #881

If we encounter a newline, the document is likely pretty-printed, so the newline is probably followed by multiple spaces.

In that case we can use SWAR (SIMD within a register) to count up to eight consecutive spaces at once.
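For readers unfamiliar with the technique, here is a minimal sketch of the idea. It is not the code merged in this PR. The helper names `count_leading_spaces8` and `skip_whitespace` are hypothetical, and the sketch assumes a GCC or Clang compatible compiler for `__builtin_ctzll` / `__builtin_clzll`. It loads eight bytes into a 64-bit word, XORs them with a word made of eight space bytes so that every space becomes a zero byte, and then finds the first non-zero byte with a single bit scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: number of leading ' ' bytes (0..8) in the 8 bytes at p.
 * The caller must guarantee that 8 bytes are readable. */
static inline unsigned count_leading_spaces8(const char *p)
{
    uint64_t chunk;
    memcpy(&chunk, p, sizeof(chunk));              /* safe unaligned load */

    /* XOR with eight 0x20 bytes: each space byte becomes 0x00. */
    uint64_t diff = chunk ^ UINT64_C(0x2020202020202020);
    if (diff == 0) {
        return 8;                                  /* all eight bytes are spaces */
    }

    /* Locate the first non-zero byte in memory order. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return (unsigned)(__builtin_clzll(diff) >> 3);
#else
    return (unsigned)(__builtin_ctzll(diff) >> 3);
#endif
}

/* Hypothetical whitespace skipper: byte-at-a-time in general, but after a
 * newline it tries to consume the indentation eight spaces at a time. */
static const char *skip_whitespace(const char *cursor, const char *end)
{
    assert(cursor <= end);

    while (cursor < end) {
        switch (*cursor) {
        case '\n':
            cursor++;
            /* Likely pretty-printed: expect a run of spaces next. */
            while (end - cursor >= 8) {
                unsigned n = count_leading_spaces8(cursor);
                cursor += n;
                if (n < 8) {
                    break;                         /* hit a non-space byte */
                }
            }
            break;
        case ' ':
        case '\t':
        case '\r':
            cursor++;
            break;
        default:
            return cursor;                         /* first non-whitespace byte */
        }
    }
    return cursor;
}
```

Calling `skip_whitespace(doc, doc + strlen(doc))` on a buffer that starts with a newline and twelve spaces of indentation returns a pointer to the first byte after the indentation, using two 8-byte loads instead of twelve single-byte comparisons. When fewer than eight bytes remain, the sketch falls back to the byte-at-a-time path so it never reads past `end`.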

== Parsing activitypub.json (58160 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     1.118k i/100ms
Calculating -------------------------------------
               after     11.223k (± 0.7%) i/s   (89.10 μs/i) -     57.018k in   5.080522s

Comparison:
              before:    10834.4 i/s
               after:    11223.4 i/s - 1.04x  faster

== Parsing twitter.json (567916 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after   118.000 i/100ms
Calculating -------------------------------------
               after      1.188k (± 1.0%) i/s  (841.62 μs/i) -      6.018k in   5.065355s

Comparison:
              before:     1094.8 i/s
               after:     1188.2 i/s - 1.09x  faster

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after    58.000 i/100ms
Calculating -------------------------------------
               after    570.506 (± 3.7%) i/s    (1.75 ms/i) -      2.900k in   5.091529s

Comparison:
              before:      419.6 i/s
               after:      570.5 i/s - 1.36x  faster

== Parsing float parsing (2251051 bytes)
ruby 3.4.6 (2025-09-16 revision dbd83256b1) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after    22.000 i/100ms
Calculating -------------------------------------
               after    212.010 (± 1.9%) i/s    (4.72 ms/i) -      1.078k in   5.086885s

Comparison:
              before:      189.4 i/s
               after:      212.0 i/s - 1.12x  faster

FYI: @samyron

Co-Authored-By: Scott Myron <[email protected]>
byroot force-pushed the parser-whitespace-switch branch from 9cd6375 to b3fd7b2 on November 1, 2025 11:55
byroot merged commit acbf40b into ruby:master on Nov 1, 2025
37 checks passed
samyron commented Nov 1, 2025

Thank you for the improvements!

byroot deleted the parser-whitespace-switch branch on November 1, 2025 16:04