Skip to content

Conversation

@MrMarvin
Copy link

What was the end-user or developer problem that led to this PR?

During bundler install runs, most prominently as part of a CI / container building pipeline, I was wondering which of our (many) gems where having the most impact on runtime. For a larger-ish project with some hundred+ gems a typical bundler install run can take multiple minutes. As of right now, there was no good way (that we knew of) to dissect which individual gems take the majority of time.
Native extensions are an obvious culprit and so are network download times.

We've been hacking a small plugin called Bundler-Timing, which gives per-gem install timing statistics. However this approach was lacking more detailed events to determine:

  • How long a gem source took to download, as part of the overall installation
  • How expensive was the git checkout operation

What is your fix for the problem, implemented in this PR?

Introduce two new sets of plugin events:

  • GEM_BEFORE_FETCH and GEM_AFTER_FETCH
  • GIT_BEFORE_FETCH and GIT_AFTER_FETCH

Please feel free to see the above mentioned plugin for an example implementing the hooks.

If introducing new hooks is generally acceptable for this project and there is no concerns moving forward, I'll add some specs as well. Right now I'm happy with interactively experimenting using our timing plugin.

Make sure the following tasks are checked

@welcome
Copy link

welcome bot commented Oct 17, 2024

Thanks for opening a pull request and helping make RubyGems and Bundler better! Someone from the RubyGems team will take a look at your pull request shortly and leave any feedback. Please make sure that your pull request has tests for any changes or added functionality.

We use GitHub Actions to test and make sure your change works functionally and uses acceptable conventions, you can review the current progress of GitHub Actions in the PR status window below.

If you have any questions or concerns that you wish to ask, feel free to leave a comment in this PR or join our #rubygems or #bundler channel on Slack.

For more information about contributing to the RubyGems project feel free to review our CONTRIBUTING guide

@MrMarvin MrMarvin force-pushed the mf/fetch_timing_hooks branch from c402f3b to 6392a2b Compare October 17, 2024 10:39
@indirect
Copy link
Contributor

This is very cool! A timing profiler gem sounds extremely useful. Thanks for building that, and for proposing these hooks.

Do you think it still makes sense to still fire the hooks even if the gem has no remote and cannot be fetched, and so the interior is just a no-op?

@MrMarvin MrMarvin force-pushed the mf/fetch_timing_hooks branch from 6392a2b to 11fcf0d Compare February 8, 2025 14:15
@MrMarvin
Copy link
Author

MrMarvin commented Feb 8, 2025

Hi @indirect - took me a while to get back to this.

Since then:

  1. I've rebased to current master, resolving a single conflict and force pushed PR branch.
  2. Added to specs for both regular (https) gem fetches as well as git based fetch/checkouts hooks.

Its likely I don't fully grok the details of git based fetches yet, so might be missing something. Can you clarify your earlier question

Do you think it still makes sense to still fire the hooks even if the gem has no remote and cannot be fetched, and so the interior is just a no-op?

What is an example of a git source gem without a remote? 🤔
I've tested with both local file based repositories (used in specs) and with gems pulled from GitHub interactively. I think both cases should follow the same principle of emitting hooks, even if the timing information derived from a local clone will be very different to remote network fetches.

@indirect
Copy link
Contributor

indirect commented Feb 9, 2025

Very cool, nice work. My question was about regular gems with no remote source, like :path gems that are just files already present on disk, and I guess it makes sense to emit hooks regardless, even if no fetching is taking place.

@deivid-rodriguez
Copy link
Contributor

This seems related to #3305, and #7622. I think install time should be provided by Bundler by default, at least in verbose mode.

Can you imagine other use cases for these hooks?

Overall I'm not too convinced about adding more hooks right now because I feel we should focus on improving our plugin system first, so it gets more appealing (hopefully I can resume work on #6957 soon). For example, we recently added #3439 but I have not heard of any actual real usage yet.

Besides I'm not sure these hooks will be truly representative of "network time". When I've seen Bundler getting slow due to network operations, it's been usually when downloading the compact index, and I think that's not captured by these hooks. On the other hand, download time between individual gems is usually very similar and if it's slow it's usually an overall problem not caused by the size of a particular gem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants