Update lobsters #358

byroot · 2025-01-29T18:47:40Z

THe initial snapshot was from a few years back, the application changed a lot since then.

I had to run a bunch of migrations and had to fiddle with various sqlite3/msyql incompatibilities. Also some route changes, but with a bit of effort I managed to make it work again.

What is interesting is it seems lobste.rs became almost 3x as slow in these couple years, I haven't yet profiled to know why.

This branch:

itr:   time
 #1: 1532ms
 #2: 1013ms
 #3:  910ms
 #4:  852ms
 #5:  851ms
 #6:  901ms
 #7:  857ms
 #8:  886ms
 #9:  875ms

main:

itr:   time
 #1:  884ms
 #2:  482ms
 #3:  344ms
 #4:  358ms
 #5:  300ms
 #6:  316ms
 #7:  315ms
 #8:  328ms
 #9:  309ms

cc @rwstauner

Also FYI @pushcx, I'll try to let you know if I find the source of this massive slow-down. But perhaps it's just a red herring.

byroot · 2025-01-29T20:24:23Z

Profile after the upgrade: https://share.firefox.dev/4hrxsdY
Profile before: https://share.firefox.dev/3PTZyCG

Observations:

Sqlite3#step (queries) went from 19% to 25%.
There is a brand new 6.7% spent in Kernel#caller, and that's coming from Prosopite.subscribe (this is very fishy).

Nothing else very obvious just yet, after that it's all < 2%, so I'll have to play seven differences on the flamegraph.

byroot · 2025-01-29T20:26:23Z

Kernel#caller, and that's coming from Prosopite.subscribe (this is very fishy).

Holy cow, it's digesting the entire backtrace to make a unique key for queries: https://github.com/charkost/prosopite/blob/36f7f5a4fa800fcfa1f9bef5cae94c9739b128fc/lib/prosopite.rb#L263-L272

pushcx · 2025-01-29T20:40:11Z

Thanks for pinging me, I'm very curious to hear what you find. Though we don't have APM set up, I think we would've noticed a prod slowdown that big, so Prosopite or another dev/test change seems plausible.

I should mention two in-progress projects, neither of which has merged to master:

We've started work migrating away from MariaDB. That issue only talks about postgres, but we've had some sqlite advocacy this week and the existence of this benchmark's partial port is encouraging. So either it's a lot of churn for you (postgres) or a lot less work for you (sqlite).
Tomorrow @joeldrapper and I are experimenting on stream with porting some partials to Phlex. I'm still uncertain about adopting a young gem as a dependency, but its design addresses a lot of pain points I've had. If we entirely migrate from Erb to Phlex that might reduce the codebase's value to you as a Rails benchmark.

(EDIT: I have been scratching my head for two days trying to remember where I first learned of Phlex, and it just clicked that it's this thread that also includes some criticism from you. Small world!)

byroot · 2025-01-29T21:21:08Z

so Prosopite or another dev/test change seems plausible.

Note that the benchmark runs in production environment. AFAICT this Prosopite overhead isn't dev/test only.

the existence of this benchmark's partial port is encouraging.

Not really, we're only testing a handful of endpoint, and I had to get rid of some queries. It's probably a tons of work to migrate, but that's your call, doesn't affect us much.

If we entirely migrate from Erb to Phlex that might reduce the codebase's value to you as a Rails benchmark.

Probably yes. We can always chose to stop following lobsters and just freeze the codebase forever, just to Rails upgrades once in a while.

maximecb · 2025-01-29T22:04:23Z

Worth profiling and maybe looking at how much code YJIT generates before/after since the change is so massive, also yjit insn executed count, percentage in YJIT.

byroot · 2025-01-29T22:37:26Z

how much code YJIT generates before/after

Before:

inline_code_size:    6,619,284
outlined_code_size:  2,340,100
code_region_size:    16,924,672
yjit_alloc_size:     25,453,141
yjit_compile_time:   1090.47ms
RSS: 372.3MiB
MAXRSS: 375.6MiB

After:

inline_code_size:    7,340,768
outlined_code_size:  2,462,532
code_region_size:    18,661,376
yjit_alloc_size:     27,652,073
yjit_compile_time:   1288.00ms
RSS: 330.3MiB
MAXRSS: 330.7MiB

byroot · 2025-01-30T08:03:10Z

Alright, so a major piece of that was my mistake. I thought the benchmark had caching entirely disabled, but actually no.

If I re-enable the cache like it was before, it's already much closer:

itr:   time
 #1: 1101ms
 #2:  655ms
 #3:  483ms
 #4:  481ms
 #5:  485ms
 #6:  484ms
 #7:  460ms
 #8:  531ms
 #9:  477ms
#10:  472ms
#11:  466ms
#12:  458ms
#13:  459ms
#14:  472ms
#15:  465ms
#16:  460ms
#17:  470ms
#18:  473ms
#19:  456ms
#20:  475ms
#21:  486ms
#22:  469ms
#23:  453ms
#24:  471ms
#25:  467ms

Now it's only about 50% slower. I'll see if I can figure out where the remaining difference comes from.

While upgrading the lobsters benchmark in yjit-bench I noticed 2.8% of the overall time spent in `Dir[]`, all coming from FileUpdateChecker, which shouldn't be a think in production. After tracking it down, it comes from mission-control. Profile: https://share.firefox.dev/4hBdygq Ref: ruby/ruby-bench#358

byroot · 2025-01-30T08:20:51Z

Found a 3% regression caused by mission-control: rails/mission_control-jobs#241

I'll keep digging.

byroot · 2025-01-30T08:35:59Z

Found a regression in psych: ruby/psych#713

byroot · 2025-01-30T08:40:36Z

Seems like a big regression is the /rss route, the new stories.rss.builder is much slower than the old home/rss.erb, but I guess that's expected, can't beat ERB with Ruby DSLs. lobsters/lobsters@629df02

While upgrading the lobsters benchmark in yjit-bench I noticed 2.8% of the overall time spent in `Dir[]`, all coming from FileUpdateChecker, which shouldn't be a think in production. After tracking it down, it comes from mission-control. Profile: https://share.firefox.dev/4hBdygq Ref: ruby/ruby-bench#358

byroot · 2025-01-30T08:56:37Z

Another important change seems to be:

get "/threads" => "comments#threads"

That became:

get "/threads" => "comments#user_threads"

And that new action a at least one order of magnitude more costly. But in a way that is legit, because the old action would simply redirect back to /active, so I think it was an oversight in the benchmark, as it didn't make much sense to benchmark that.

byroot · 2025-01-30T09:14:01Z

Alright, I went over all routes one by one, the final big difference comes from stories#show.

Somehow the stories#load_user_votes is way slower, enough to make that particular action 3x slower.

Noah had left a comment in the route generator that seems related:

    { num: 15, routes: ["/s/:story_id"] },  # not including upvote/downvote because that can change story visibility dynamically

Which makes me wonder if perhaps he had disabled some part of the action before, I haven't yet identified what.

byroot · 2025-01-30T09:35:06Z

Something interesting, but not necessarily a huge deal. The newer version of lobsters has way more handcrafted SQL, which causes Active Record to attempt to parse some of it to get some metadata.

E.g.:

    Comment
      .joins(<<~SQL
        inner join (
          with recursive discussion as (
          select
            c.id,
            0 as depth,
            (select count(*) from comments where parent_comment_id = c.id) as reply_count,
            cast(confidence_order as char(#{Comment::COP_LENGTH})) as confidence_order_path
            from comments c
            where
              thread_id in (#{thread_ids.join(", ")}) and
              parent_comment_id is null
          union all
          select
            c.id,
            discussion.depth + 1,
            (select count(*) from comments where parent_comment_id = c.id),
            cast(concat(
              discussion.confidence_order_path, 3 * (depth + 1),
              c.confidence_order
            ) as char(#{Comment::COP_LENGTH}))
          from comments c join discussion on c.parent_comment_id = discussion.id
          )
          select * from discussion as comments
        ) as comments_recursive on comments.id = comments_recursive.id
      SQL
            )
      .order("comments.thread_id desc, comments_recursive.confidence_order_path")
      .select('
        comments.*,
        comments_recursive.depth as depth,
        comments_recursive.reply_count as reply_count
      ')

Here since the joins is raw SQL, Active Record tries to parse that string to extract table names with a Regexp: https://github.com/rails/rails/blob/e03f459163a333642a62bf4569beae7e426c1a28/activerecord/lib/active_record/relation.rb#L1486-L1508

And since the string is pretty big, that end up accounting for 2% of the runtime on /s/:story_id.

THe initial snapshot was from a few years back, the application changed a lot since then. I had to run a bunch of migrations and had to fiddle with various sqlite3/msyql incompatibilities. Also some route changes, but with a bit of effort I managed to make it work again. What is interesting is it seems lobste.rs became 2-3-x as slow in these couple years, I haven't yet profiled to know why. This branch: ``` itr: time ruby#1: 1532ms ruby#2: 1013ms ruby#3: 910ms ruby#4: 852ms ruby#5: 851ms ruby#6: 901ms ruby#7: 857ms ruby#8: 886ms ruby#9: 875ms ``` main: ``` itr: time ruby#1: 884ms ruby#2: 482ms ruby#3: 344ms ruby#4: 358ms ruby#5: 300ms ruby#6: 316ms ruby#7: 315ms ruby#8: 328ms ruby#9: 309ms ```

byroot · 2025-01-30T09:46:06Z

Since I don't see any more obvious illegitimate slowdowns, here's main vs this branch:

main:

interp: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin23]
yjit: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23]

--------  -----------  ----------  ---------  ----------  ------------  -----------
bench     interp (ms)  stddev (%)  yjit (ms)  stddev (%)  yjit 1st itr  interp/yjit
lobsters  459.7        2.4         296.9      3.0         0.558         1.548      
--------  -----------  ----------  ---------  ----------  ------------  -----------

updated-lobsters:

interp: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin23]
yjit: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23]

--------  -----------  ----------  ---------  ----------  ------------  -----------
bench     interp (ms)  stddev (%)  yjit (ms)  stddev (%)  yjit 1st itr  interp/yjit
lobsters  600.6        1.9         453.1      2.2         0.543         1.326      
--------  -----------  ----------  ---------  ----------  ------------  -----------

Based on my profiling, most of the slowdown comes from logic that has moved from Ruby to SQL queries.

Now there's the question of whether we wish to upgrade or not, given that arguably this newer version of lobsters has much less Ruby code in it. On the other hand, the displayed speedup is more in line with what Rails apps seem to experience in production, so perhaps it's actually a good thing? Dunno.

maximecb · 2025-01-30T10:44:38Z

Thanks for the investigation Jean 🙏

How does the percentage in YJIT compare before/after?

Now there's the question of whether we wish to upgrade or not, given that arguably this newer version of lobsters has much less Ruby code in it. On the other hand, the displayed speedup is more in line with what Rails apps seem to experience in production, so perhaps it's actually a good thing? Dunno.

Yeah I'm torn. On the one hand it's nice if the benchmark is bigger. On the other hand we kind of need as much Ruby code as possible to focus our optimization efforts I feel. When YJIT team members try to implement small optimizations, we look at performance differences on lobsters a lot (because it's the biggest benchmark). If the difference is below 1%, it's hard for us to make the call.

byroot · 2025-01-30T10:57:04Z

How does the percentage in YJIT compare before/after?

Good question, I need to figure out how to extract that one.

byroot · 2025-01-30T11:04:23Z

Ok, found it. Full YJIT stats in https://gist.github.com/byroot/e0d86a2ab174b140968f3b9b8e3d439d

Before:

ratio_in_yjit:                 99.0%
avg_len_in_yjit:                63.5

After:

ratio_in_yjit:                 98.9%
avg_len_in_yjit:                65.3

maximecb · 2025-02-10T18:08:29Z

Tagging @XrXr to have his opinion.

If the slowdown is indeed more logic moving to SQL, then we may want to either:

Only update rails without changing the bench
Or, try to make the SQL faster? We should be using an in-memory database like sqlite if we aren't already?

byroot · 2025-02-10T18:09:54Z

We should be using an in-memory database like sqlite if we aren't already?

We are already.

XrXr

It looks like it's the mostly the same amount of ruby work, but there is now more non-ruby workload in the mix to make it slower overall. I share the concern about having a weaker signal for small changes, but a 30% speedup stills seems like a good chunk of the workload. Tracking new changes to keep benchmark relevant is probably important (Rails 8 instead of 7, for example).

I vote that we upgrade.

XrXr · 2025-02-11T17:57:54Z

benchmarks/lobsters/Gemfile

@@ -1,75 +1,82 @@
 source "https://rubygems.org"

-# Everything except Action Cable. It's unused and it installs native gems.


If we upgrade, I'd like to get rid of nio4r, still, since the bench doesn't use Action Cable. We can do that in a separate PR, though.

maximecb · 2025-02-11T18:27:20Z

I vote that we upgrade.

Jean suggested we could just upgrade Rails while keeping the benchmark largely the same and it shouldn't be too much trouble? We're agreeing on going that path?

I'd like to get rid of nio4r, still, since the bench doesn't use Action Cable. We can do that in a separate PR, though.

👍

XrXr · 2025-02-11T19:47:19Z

Jean suggested we could just upgrade Rails while keeping the benchmark largely the same and it shouldn't be too much trouble? We're agreeing on going that path?

That sounds good to me too!

Fix: ruby#358 Before: ``` inline_code_size: 6,752,176 outlined_code_size: 2,067,600 code_region_size: 17,268,736 yjit_alloc_size: 25,934,245 yjit_compile_time: 989.69ms RSS: 393.7MiB MAXRSS: 393.9MiB Average of last 15, non-warmup iters: 299ms Total time spent benchmarking: 27s interp: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin23] yjit: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23] -------- ----------- ---------- --------- ---------- ------------ ----------- bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit lobsters 463.9 1.3 299.8 2.6 0.563 1.547 -------- ----------- ---------- --------- ---------- ------------ ----------- ``` After: ``` inline_code_size: 6,808,680 outlined_code_size: 2,178,968 code_region_size: 17,448,960 yjit_alloc_size: 26,006,453 yjit_compile_time: 978.48ms RSS: 380.1MiB MAXRSS: 380.3MiB Average of last 16, non-warmup iters: 296ms Total time spent benchmarking: 27s interp: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin23] yjit: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23] -------- ----------- ---------- --------- ---------- ------------ ----------- bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit lobsters 463.1 1.8 296.3 2.2 0.596 1.563 -------- ----------- ---------- --------- ---------- ------------ ----------- ```

Fix: #358 Before: ``` inline_code_size: 6,752,176 outlined_code_size: 2,067,600 code_region_size: 17,268,736 yjit_alloc_size: 25,934,245 yjit_compile_time: 989.69ms RSS: 393.7MiB MAXRSS: 393.9MiB Average of last 15, non-warmup iters: 299ms Total time spent benchmarking: 27s interp: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin23] yjit: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23] -------- ----------- ---------- --------- ---------- ------------ ----------- bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit lobsters 463.9 1.3 299.8 2.6 0.563 1.547 -------- ----------- ---------- --------- ---------- ------------ ----------- ``` After: ``` inline_code_size: 6,808,680 outlined_code_size: 2,178,968 code_region_size: 17,448,960 yjit_alloc_size: 26,006,453 yjit_compile_time: 978.48ms RSS: 380.1MiB MAXRSS: 380.3MiB Average of last 16, non-warmup iters: 296ms Total time spent benchmarking: 27s interp: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin23] yjit: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin23] -------- ----------- ---------- --------- ---------- ------------ ----------- bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit lobsters 463.1 1.8 296.3 2.2 0.596 1.563 -------- ----------- ---------- --------- ---------- ------------ ----------- ```

pushcx · 2025-05-06T23:25:21Z

I found this again while cleaning out some old tabs. I hadn't wanted to distract from your PR work.

Seems like a big regression is the /rss route, the new stories.rss.builder is much slower than the old home/rss.erb, but I guess that's expected, can't beat ERB with Ruby DSLs.

I rewrote this because the ERB produced invalid RSS. It's the same reason I'm looking to replace ERB at the view layer, we're constantly firefighting very basic HTML bugs like missing closing tags. Lobsters didn't get slower because it replaced ERB with a builder, it got correct.

ERB reminds me very strongly of 1990s C++ where a tool supposedly produced high-performance code, but it achieved it by completely lacking any kind of safety. The C++ culture reinforced it by blaming the developer for bugs that tools could have prevented. If ERB can't help this program produce correct output, it's going to get replaced with one that can. I don't have large amounts of developer time to throw after the unending stream of bugs it produces, and I definitely have too much professional self-respect to accept that correctness should be, or even can be handled by blaming individuals instead of developing reliable systems and processes.

I saw the newly-announced herb tooling and I'll give that a few months to see if it produces tools like to address ERB's fundamental shortcomings. I'm not optimistic for staying. To echo your comment: I guess that's expected, can't beat a Ruby DSL with string soup.

byroot · 2025-05-09T07:11:10Z

I saw the newly-announced herb tooling and I'll give that a few months to see if it produces tools like to address ERB's fundamental shortcomings.

herb look nice, but short term https://github.com/Shopify/erb_lint has existed for a while.

byroot mentioned this pull request Jan 30, 2025

Stop checking for file updates in production rails/mission_control-jobs#241

Merged

byroot force-pushed the update-lobsters branch from 202d155 to d332d91 Compare January 30, 2025 09:37

maximecb assigned XrXr Feb 10, 2025

XrXr approved these changes Feb 11, 2025

View reviewed changes

byroot mentioned this pull request Feb 11, 2025

Update lobsters benchmark to Rails 8.0 #360

Merged

maximecb closed this in #360 Feb 11, 2025

pushcx mentioned this pull request Oct 14, 2025

Fix HTML markup errors found by Herb lobsters/lobsters#1754

Merged

		@@ -1,75 +1,82 @@
		source "https://rubygems.org"

		# Everything except Action Cable. It's unused and it installs native gems.

Update lobsters #358

Update lobsters #358

Uh oh!

Conversation

byroot commented Jan 29, 2025

Uh oh!

byroot commented Jan 29, 2025

Uh oh!

byroot commented Jan 29, 2025

Uh oh!

pushcx commented Jan 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

byroot commented Jan 29, 2025

Uh oh!

maximecb commented Jan 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

byroot commented Jan 29, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

maximecb commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

byroot commented Jan 30, 2025

Uh oh!

maximecb commented Feb 10, 2025

Uh oh!

byroot commented Feb 10, 2025

Uh oh!

XrXr left a comment

Choose a reason for hiding this comment

Uh oh!

XrXr Feb 11, 2025

Choose a reason for hiding this comment

Uh oh!

maximecb commented Feb 11, 2025

Uh oh!

XrXr commented Feb 11, 2025

Uh oh!

pushcx commented May 6, 2025

Uh oh!

byroot commented May 9, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pushcx commented Jan 29, 2025 •

edited

Loading

maximecb commented Jan 29, 2025 •

edited

Loading