-
Notifications
You must be signed in to change notification settings - Fork 175
Further optimize ripper translator by not using delegate
#3868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ebf61ff to
111c256
Compare
| # little easier to work with. We delegate all other methods to the array. | ||
| class Token < SimpleDelegator | ||
| # @dynamic initialize, each, [] | ||
| class Token < BasicObject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to not just subclass Array?
That would avoid an extra indirection and (in theory at least) be more compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this but then ignores the == from the subclass. Somehow also slower on cruby
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, that's subtle, I think it's because of these semantics: https://github.com/truffleruby/truffleruby/blob/a48ecf894f19636f09235f5ac70ea6d9889be255/src/main/ruby/truffleruby/core/array.rb#L129-L132
i.e. if other is not an Array then always use other == self, but if other is an Array then just compare all elements and don't call other.==
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change turned out to reveal a bug in TruffleRuby, as TruffleRuby wouldn't destructure/splat such "BasicObject arrays" since BasicObject doesn't have respond_to?.
Fixed in truffleruby/truffleruby#4128
Fixing/revealing that bug is good, but perf-wise having to go through that slow path coercion (rb_convert_type()) doesn't seem great, I think we should use regular arrays like ::Ripper and then maybe have some private constants so we can do things like:
token[LOCATION]
token[EVENT]
token[VALUE]
token[STATE]Any thoughts on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Token subclasses we'd still need to use those to have the custom ==, but for instances of Token we could use a plain Array instead.
It would also mean less allocations which is always helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this https://github.com/ruby/prism/pull/3868/changes#r2724651412? Sorry, I didn't see that comment at all. Would defining to_ary do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would still go through rb_convert_type() but more direct than without:
https://github.com/truffleruby/truffleruby/blob/4180ab960fe223d333e86b6f3263eb1e29d8f942/src/main/ruby/truffleruby/core/type.rb#L301
vs
https://github.com/truffleruby/truffleruby/blob/4180ab960fe223d333e86b6f3263eb1e29d8f942/src/main/ruby/truffleruby/core/type.rb#L303
So yeah that'd be good but even better would be to use real arrays as I'd think e.g. all method lookups in rb_convert_type() are uncached on CRuby, and likely megamorphic on TruffleRuby.
|
Great speedup and nice to the extra |
Using it seems pretty bad for performance:
```rb
require "benchmark/ips"
require "prism"
require "ripper"
codes = Dir["**/*.rb"].map { File.read(it) }
Benchmark.ips do |x|
x.report("prism") { codes.each { Prism::Translation::Ripper.lex(it) } }
x.report("ripper") { codes.each { Ripper.lex(it) } }
x.compare!
end
```
```
# Before
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
prism 1.000 i/100ms
ripper 1.000 i/100ms
Calculating -------------------------------------
prism 0.319 (± 0.0%) i/s (3.14 s/i) - 2.000 in 6.276154s
ripper 0.647 (± 0.0%) i/s (1.54 s/i) - 4.000 in 6.182662s
Comparison:
ripper: 0.6 i/s
prism: 0.3 i/s - 2.03x slower
# After
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
prism 1.000 i/100ms
ripper 1.000 i/100ms
Calculating -------------------------------------
prism 0.482 (± 0.0%) i/s (2.08 s/i) - 3.000 in 6.225603s
ripper 0.645 (± 0.0%) i/s (1.55 s/i) - 4.000 in 6.205636s
Comparison:
ripper: 0.6 i/s
prism: 0.5 i/s - 1.34x slower
```
`vernier` tells me it does `method_missing` even for explicitly defined methods like `location`.
111c256 to
2ea8139
Compare
| # We want to pretend that this is just an Array. | ||
| def ==(other) # :nodoc: | ||
| @array == other | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For speed it might be relevant to define to_ary, since that will be checked by Array#== and more methods when using this Token as an Array (e.g. https://github.com/truffleruby/truffleruby/blob/a48ecf894f19636f09235f5ac70ea6d9889be255/src/main/ruby/truffleruby/core/array.rb#L130).
Defining to_ary will avoid going through respond_to_missing? for that case.
Using it seems pretty bad for performance:
verniertells me it doesmethod_missingeven for explicitly defined methods likelocation. Probably that is for[].