`linera_core::client`: batch downloading of missing blobs (#4755) #4768

ma2bd · 2025-10-09T15:39:29Z

Motivation

Batch downloading missing blobs

Proposal

Backport #4755

Test Plan

CI

…4755)

Currently, when we synchronize a chain, even though we receive the certificates in a batch to `process_certificates`, we handle them one by one at the local node level, and if a certificate is missing blobs we stop, download the blobs for that certificate, then retry, making the download of the blobs sequential. This makes startup time for the client linear in the number of certificates-with-blobs present in its initial chains (notably, the admin chain).

We already have an ahead-of-time indicator of which blobs will be required by the certificates, so there's no need to download them one at a time.

If we don't have some blobs marked as required by the certificate batch, try to download them (concurrently) before proceeding to process the batch. ~Thereafter, `BlobsNotFound` when processing the batch is a hard error.~ `required_blob_ids()` is conservative, so we still need to download blobs and retry if we get a `BlobsNotFound` error thereafter.

CI.

- These changes should be backported to the latest `testnet` branch, then
- be released in a new SDK.
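As a rough illustration of the flow this commit message describes (one batched download up front, then a per-certificate fallback), here is a minimal synchronous sketch. It is not the code from the PR: `Certificate`, `LocalNode`, the `declared_blob_ids`/`used_blob_ids` fields, and the `download_rounds` counter are hypothetical stand-ins, and the real client is asynchronous and downloads concurrently.

```rust
use std::collections::{BTreeSet, HashMap};

type BlobId = u32;

/// Hypothetical stand-in for a certificate.
struct Certificate {
    /// What an ahead-of-time indicator such as `required_blob_ids()` would report.
    declared_blob_ids: Vec<BlobId>,
    /// What processing actually needs; may contain blobs not declared above.
    used_blob_ids: Vec<BlobId>,
}

#[derive(Debug, PartialEq)]
struct BlobsNotFound(Vec<BlobId>);

/// Hypothetical local node: blob storage plus a counter of download round trips.
struct LocalNode {
    blobs: HashMap<BlobId, String>,
    download_rounds: usize,
}

impl LocalNode {
    fn new() -> Self {
        LocalNode { blobs: HashMap::new(), download_rounds: 0 }
    }

    /// Fetches the given blobs from the (simulated) remote node in one round trip.
    fn download_blobs(&mut self, remote: &HashMap<BlobId, String>, ids: &[BlobId]) {
        self.download_rounds += 1;
        for id in ids {
            if let Some(blob) = remote.get(id) {
                self.blobs.insert(*id, blob.clone());
            }
        }
    }

    /// Succeeds only if every blob the certificate uses is available locally.
    fn handle_certificate(&self, cert: &Certificate) -> Result<(), BlobsNotFound> {
        let missing: Vec<BlobId> = cert
            .used_blob_ids
            .iter()
            .copied()
            .filter(|id| !self.blobs.contains_key(id))
            .collect();
        if missing.is_empty() { Ok(()) } else { Err(BlobsNotFound(missing)) }
    }

    fn process_certificates(
        &mut self,
        remote: &HashMap<BlobId, String>,
        certs: &[Certificate],
    ) -> Result<(), BlobsNotFound> {
        // Step 1: one batched download for every declared blob we do not have yet.
        let missing: BTreeSet<BlobId> = certs
            .iter()
            .flat_map(|cert| cert.declared_blob_ids.iter().copied())
            .filter(|id| !self.blobs.contains_key(id))
            .collect();
        if !missing.is_empty() {
            let ids: Vec<BlobId> = missing.into_iter().collect();
            self.download_blobs(remote, &ids);
        }
        // Step 2: process in order; the declared set is conservative, so keep
        // the download-and-retry fallback for any remaining `BlobsNotFound`.
        for cert in certs {
            if let Err(BlobsNotFound(ids)) = self.handle_certificate(cert) {
                self.download_blobs(remote, &ids);
                self.handle_certificate(cert)?;
            }
        }
        Ok(())
    }
}
```

In this model, a batch whose declared blobs cover everything costs a single download round, and each certificate with undeclared blobs costs one extra round via the fallback.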

afck

Sorry, I just realized this is wrong. We have to revert (or fix) the original PR instead:

`store_blobs` only stores blobs if it has already seen proof for them, so `process_certificate` must be called first, otherwise the blob won't get written.

afck · 2025-10-09T16:14:10Z

We could instead call `process_certificate` in a loop for all the certificates, and make a note of which one caused the first `BlobsNotFound` error. Then collect all `BlobsNotFound` errors, download the blobs in a batch, `store_blobs`, and then call `process_certificate` in a loop again, starting from the first height where we got an error.
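A minimal sketch of the two-pass alternative described in this comment, under stated assumptions: `LocalNode`, `Certificate`, the `Error` enum, the `proven` set, and `sync_batch` are hypothetical models, not the linera-core types. The model encodes the constraint from the earlier comment that `store_blobs` only keeps blobs for which a certificate has already been seen.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

type BlobId = u32;

struct Certificate {
    height: u64,
    blob_ids: Vec<BlobId>,
}

#[derive(Debug, PartialEq)]
enum Error {
    BlobsNotFound(Vec<BlobId>),
    /// The certificate's predecessor has not been executed yet.
    MissingEarlierHeight,
}

#[derive(Default)]
struct LocalNode {
    next_height: u64,
    blobs: HashMap<BlobId, String>,
    /// Blob IDs referenced by a certificate we have seen (the "proof").
    proven: HashSet<BlobId>,
}

impl LocalNode {
    fn process_certificate(&mut self, cert: &Certificate) -> Result<(), Error> {
        if cert.height < self.next_height {
            return Ok(()); // Already executed.
        }
        self.proven.extend(cert.blob_ids.iter().copied());
        let missing: Vec<BlobId> = cert
            .blob_ids
            .iter()
            .copied()
            .filter(|id| !self.blobs.contains_key(id))
            .collect();
        if !missing.is_empty() {
            return Err(Error::BlobsNotFound(missing));
        }
        if cert.height > self.next_height {
            return Err(Error::MissingEarlierHeight);
        }
        self.next_height += 1;
        Ok(())
    }

    /// Keeps only blobs that some seen certificate refers to.
    fn store_blobs(&mut self, blobs: Vec<(BlobId, String)>) {
        for (id, blob) in blobs {
            if self.proven.contains(&id) {
                self.blobs.insert(id, blob);
            }
        }
    }
}

fn sync_batch(
    node: &mut LocalNode,
    remote: &HashMap<BlobId, String>,
    certs: &[Certificate],
) -> Result<(), Error> {
    let mut first_failure: Option<usize> = None;
    let mut missing: BTreeSet<BlobId> = BTreeSet::new();
    // First pass: try everything, remembering the first failure and all missing blobs.
    for (index, cert) in certs.iter().enumerate() {
        match node.process_certificate(cert) {
            Ok(()) => {}
            Err(Error::BlobsNotFound(ids)) => {
                first_failure = first_failure.or(Some(index));
                missing.extend(ids);
            }
            // Expected once an earlier certificate has failed; retried below.
            Err(Error::MissingEarlierHeight) if first_failure.is_some() => {}
            Err(other) => return Err(other),
        }
    }
    let Some(start) = first_failure else { return Ok(()) };
    // One batched download (simulated by a map lookup), then store.
    let blobs: Vec<(BlobId, String)> = missing
        .iter()
        .filter_map(|id| remote.get(id).map(|blob| (*id, blob.clone())))
        .collect();
    node.store_blobs(blobs);
    // Second pass: resume from the first certificate that failed.
    for cert in &certs[start..] {
        node.process_certificate(cert)?;
    }
    Ok(())
}
```

Because the first pass calls `process_certificate` on every certificate, the node has seen proof for each blob before `store_blobs` runs, which is what lets a single batched download suffice in this model.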

afck · 2025-10-09T16:35:34Z

Sorry, @Twey, @ma2bd, I missed that the fallback was already implemented! That fixes it, of course. (Although I think what I'm describing above might be a bit simpler in some cases.)

afck · 2025-10-09T16:36:03Z

linera-core/src/client/mod.rs

+                        self.download_blobs(remote_node, blob_ids).await?;
+                        self.handle_certificate(certificate).await?
+                    }
+                    x => x?,


(I'd prefer `error` or `result`.)

`error` would be incorrect. `result` is a good suggestion, but too late :)

Twey · 2025-10-09T17:36:14Z

Suggested approach in #4770.

ma2bd · 2025-10-09T23:32:03Z

Remote compatibility tests are not passing (twice). We may want to investigate

ma2bd · 2025-10-09T23:34:19Z

linera-core/src/client/mod.rs

+                &futures::stream::iter(blob_ids.into_iter().map(|blob_id| async move {
+                    remote_node.try_download_blob(blob_id).await.unwrap()
+                }))
+                .buffer_unordered(self.options.max_joined_tasks)
+                .collect::<Vec<_>>()
+                .await,
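For readers less familiar with `buffer_unordered`: it drives at most N futures from a stream at a time and yields results in completion order. The following is a rough std-only analogue using scoped threads instead of futures, so it swaps the technique and is not the PR's code. The `try_download_blob` signature here is a hypothetical stand-in, and unlike the snippet above (which calls `unwrap()`), this sketch skips failed downloads.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a remote download; returns `None` on failure.
fn try_download_blob(blob_id: u32) -> Option<String> {
    Some(format!("blob-{blob_id}"))
}

/// Downloads all blobs with at most `max_joined_tasks` downloads in flight.
/// Results arrive in completion order, not input order.
fn download_bounded(blob_ids: Vec<u32>, max_joined_tasks: usize) -> Vec<String> {
    let queue = Mutex::new(blob_ids.into_iter());
    let results = Mutex::new(Vec::new());
    thread::scope(|scope| {
        for _ in 0..max_joined_tasks.max(1) {
            scope.spawn(|| loop {
                // The lock is released at the end of this statement,
                // so other workers can take the next ID while we download.
                let next = queue.lock().unwrap().next();
                let Some(blob_id) = next else { break };
                if let Some(blob) = try_download_blob(blob_id) {
                    results.lock().unwrap().push(blob);
                }
            });
        }
    });
    results.into_inner().unwrap()
}
```

A caller that needs a deterministic order would sort or key the results afterwards, since the order depends on thread scheduling.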


nit: Can we use `let .. =` for complex expressions in argument position?

I don't love the added mental load of introducing unnecessary variables. It makes it harder to read the code from top to bottom since you don't know how the thing is going to be used until later, compared to having the top-level/motivating call (store_blobs here) first.

It's sometimes worth it if it's really unclear what the value of the expression is without a name (though that's a bit of a code smell for the names of the functions &c. involved in the expressions), but in this case I think it's pretty obvious that this is a set of blobs.

(best of both worlds is obviously Haskell-esque

storeBlobs blobs where blobs = …

😄)

Our brains don't seem to work in the same way because for me this style hurts a lot

The intermediate variable (when named appropriately) acts as a summary of the complex expression.

let tasks = something_complicated(args).await?; // <-- can fail, can yield to another thread
process_tasks(tasks); // <- do the thing

Values are protocols between a constructor and a consumer. That means there are two information flows that you want to make clear at the point of seeing a value: one is upstream ‘what the value is’; the other is downstream ‘how the value is going to be used’. Usually ‘what the value is’ is evident from ‘how the value is constructed’; sometimes there are also non-obvious invariants that aren't clear from the way the value is constructed, and then IMO it's worth giving the expression a name via variable assignment, named function argument, et cetera, though a better solution (IMO) is to make the constructing variable clearer, e.g. by extracting it into a function whose name or (better) type expresses the invariants.

In your example the problem is that tasks doesn't really tell you what the tasks are for. Are we going to count them? Track their progress? That doesn't just make it harder to see what the function as a whole does, it also might have a bearing on what the tasks themselves actually are above and beyond what's captured in the name: if we're planning to run the tasks then it's implied that they're in a runnable state (not errored, exited, et cetera). In a small two-line example like this you can quickly look at the next line to see how the information is flowing, but as functions get bigger it's easy to lose track, because you've essentially decomposed a tree:

A
| \
B  C
|  |\
D  E F

into a (backwards!) list of edges:

C -> (E, F)
D -> B
A -> (B, C)

(and usually with more noise between them!). Now you have to pay attention to and remember all the names in order to be able to mentally reconstruct the (tree-shaped) information flow of the data, rather than it being inherent in the syntax.

Concretely, in this case, I'd probably call the variable blobs. But that defers the information about why we're interested in having a set of blobs (to store them)! You could call the variable blobs_to_pass_to_store_blobs but that's a little silly to do for everything, and to get the same level of information that's given by having syntactic nesting that reflects the flow of information you need to do this transitively:

let blob_ids_to_filter_to_download_to_store = get_blob_ids();
let filtered_blob_ids_to_download_to_store = filter(blob_ids_to_filter_to_download_to_store);
let filtered_downloaded_blobs_to_store = download(filtered_blob_ids_to_download_to_store);
store(filtered_downloaded_blobs_to_store);

In the case where the expression is very opaque, a quick fix can be to give it a descriptive name that describes its value:

let blobs = mysterious_function_that_could_return_anything();

This still obscures the information flow, but it's probably worth it if you don't want to refactor/rename the mysterious function into something more readable (a named function argument gives you the best of both worlds here). But in the case where the expression directly suggests what's being returned, e.g. when you're mapping a function called try_download_blob, declaring that it returns ‘blobs’ is information-free.

There's another argument in your comment about specifically the ? and .await, (I guess) implying that any expression ending with either of those should be assigned to a variable to make them ‘loud, explicit syntax’ as Stroustrup would say. I think this is somewhat a matter of opinion and culture, so I won't try to argue it from first principles, but Rust-the-language has made the decision that these operations are unsurprising enough today to be worthy of the terser syntax after extensive discussion (rather than, say, try or await statements/blocks) and I think idiomatic Rust shouldn't second-guess that.

deuszx · 2025-10-15T08:20:18Z

Why is this PR not being merged? RN we have a divergence between conway and main.

Twey · 2025-10-16T15:07:43Z

The remote compatibility test failure doesn't look related to this at all:

2025-10-09T22:32:32.550373Z ERROR linera: Error is Failed to close chain

Caused by:
    chain client error: Signer doesn't have key to sign for chain 726e27a961a72a7e32064a9bc7a68bbf6f825e07ae8b0282698a0215615bab85

deuszx · 2025-10-20T10:54:46Z

Are we going to merge this or not? AFAIU, it's on main anyway.

ma2bd changed the base branch from main to testnet_conway October 9, 2025 15:39

ma2bd force-pushed the backport_missing_blobs branch from 7713638 to b324eea Compare October 9, 2025 15:41

afck requested changes Oct 9, 2025


ma2bd closed this Oct 9, 2025

afck reopened this Oct 9, 2025

afck approved these changes Oct 9, 2025


ma2bd commented Oct 9, 2025


Twey added the backport and performance labels Oct 16, 2025

Twey changed the title ~~Backport fix on missing blobs (#4755)~~ linera_core::client: batch downloading of missing blobs (#4755) Oct 16, 2025

ma2bd merged commit d9481a4 into linera-io:testnet_conway Oct 20, 2025
59 of 61 checks passed

ma2bd deleted the backport_missing_blobs branch October 20, 2025 15:05
