Skip to content

Concurrently download components of a toolchain #4436

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

FranciscoTGouveia
Copy link
Contributor

@FranciscoTGouveia FranciscoTGouveia commented Aug 3, 2025

As of now, each component of a toolchain is downloaded sequentially, leaving room for performance improvements.
This is particularly notorious when considering interleaving downloads with installation, as discussed in #731.

Following the work made in #4426, as the DownloadTracker is now indicatif-based, doing the progress reporting for concurrent downloads is much easier.
As such, this PR comes to enable downloading different components concurrently (possibly closing #3018).

Performance-wise, benchmarks using hyperfine showed a small (1.1x) speedup -- which doesn't seem noticeable but lays the foundations for a future PR to interleave the download with the installation (of components).
The benchmarks were ran 50 times (with 5 warmup runs) over a 50 Mbps connection, each downloading an entire toolchain.

Many thanks to @rami3l and his PR for refactoring part of the test suite for the purpose of this PR.

@FranciscoTGouveia
Copy link
Contributor Author

For reference, the current download of components (sequential) looks like this:

before-ezgif com-speed

... and the changes introduced by this PR go like this:

after-ezgif com-speed

Please note that both animations were sped up and they are not meaningful for comparing performances.

@FranciscoTGouveia FranciscoTGouveia force-pushed the concurrent-downloads branch 3 times, most recently from 10d1514 to 7093333 Compare August 5, 2025 13:13
@FranciscoTGouveia FranciscoTGouveia marked this pull request as ready for review August 5, 2025 13:24
@rami3l rami3l self-assigned this Aug 10, 2025
@@ -14,11 +14,11 @@ pub enum Notification<'a> {
RemovingDirectory(&'a str, &'a Path),
DownloadingFile(&'a Url, &'a Path),
/// Received the Content-Length of the to-be downloaded data.
DownloadContentLengthReceived(u64),
DownloadContentLengthReceived(u64, &'a str),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind explaining the meaning of "" here and why they are required? It might be an interesting addition to the doc comments as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are downloading the components concurrently, we can’t know which download the Content-Length refers to. Therefore, in addition to receiving the length itself, we also need the component’s URL so we can match it to the corresponding progress bar.

Note: I've already added a doc comment. Thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before resolving this conversation, I will investigate the correct use of this notification and the addition of this extra parameter to ensure that, even though the installations are not concurrent, it did not have any negative effect on them (which, unfortunately, cannot be guaranteed by the test suite).

rami3l

This comment was marked as duplicate.

)
.unwrap()
.progress_chars("## "),
);
pb.set_message(component.to_string());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: .to_string() doesn't seem necessary here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be misreading the situation here, but I think we actually need the .to_string() here.

This is due to the fact that the set_message() expects Into<Cow<'static, str>>. and we would try to pass a &str with a non-'static lifetime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO to_owned() more clearly expresses what you're trying to do (and might be slightly more efficient). In this case since create_progress_bar() needs two copies, suggest taking the argument as String and cloning it for use in set_message().

Some notifications needed to be updated to include the download URL,
enabling the identification of the component being downloaded. This was
necessary for accurate progress reporting of each component.
@FranciscoTGouveia
Copy link
Contributor Author

After my investigation stemming from this comment, I have concluded that the changes presented on this PR are not consistent with the previous behavior of displaying a progress bar during the installation of the component.

Would this be necessary, especially considering that the next step will be to interleave the downloads with the installation?

@rami3l
Copy link
Member

rami3l commented Aug 12, 2025

I have concluded that the changes presented on this PR are not consistent with the previous behavior of displaying a progress bar during the installation of the component.

Would this be necessary, especially considering that the next step will be to interleave the downloads with the installation?

@FranciscoTGouveia No, I don't think it is necessary for that part of the behavior to stay unchanged, as you have already mentioned. Of course, if it is easy to add it back, then it might still be interesting to do so.


things_to_install.push((component, format, downloaded_file));
let component_stream =
tokio_stream::iter(components.into_iter()).map(|(component, format, url, hash)| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have to consider the restriction of concurrent downloads similar to what we have done with threadpooled diskio.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This closure is complicated enough that I think it should be extracted into a separate function or struct.

Comment on lines +65 to +68
// During the installation this function can be called with an empty file/URL.
if file.is_empty() {
return None;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does that happen? Seems pretty iffy.

Comment on lines +69 to +74
let component = self
.file_progress_bars
.keys()
.find(|comp| file.contains(*comp))?;

self.file_progress_bars.get(component)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this elaborate setup? Are file values not the same as the components?

self.multi_progress_bars.remove(&self.progress_bar);
self.progress_bar = ProgressBar::hidden();
pub(crate) fn download_finished(&mut self, file: &str) {
if let Some(pb) = self.progress_bar(file) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: let Some(pb) = .. else { return };.

Comment on lines +115 to +116
let msg = pb.message();
pb.finish_with_message(msg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird? Could instead probably use one of the other ProgressFinish modes.

)
.unwrap()
.progress_chars("## "),
);
pb.set_message(component.to_string());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO to_owned() more clearly expresses what you're trying to do (and might be slightly more efficient). In this case since create_progress_bar() needs two copies, suggest taking the argument as String and cloning it for use in set_message().

@@ -263,10 +263,11 @@ impl Executor for Threaded<'_> {
// pretend to have bytes to deliver.
let mut prev_files = self.n_files.load(Ordering::Relaxed);
if let Some(handler) = self.notify_handler {
handler(Notification::DownloadFinished);
handler(Notification::DownloadFinished(""));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems strange. Why pass in an empty string?

handler(Notification::DownloadPushUnit(Unit::IO));
handler(Notification::DownloadContentLengthReceived(
prev_files as u64,
"",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. What does the empty string mean? Should the type be Option<_> instead?

Ok::<_, Error>((component, format, downloaded_file, hash))
}
});
if components_len > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the only use of components_len, suggest storing a bool instead.


things_to_install.push((component, format, downloaded_file));
let component_stream =
tokio_stream::iter(components.into_iter()).map(|(component, format, url, hash)| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This closure is complicated enough that I think it should be extracted into a separate function or struct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants