Skip to content

Conversation

teor2345
Copy link
Member

@teor2345 teor2345 commented Apr 23, 2025

This PR upgrades:

It adds an inbound substream timeout to the kad protocol, which matches the outbound substream timeout. This prevents "substream limit exceeded" errors under load, caused by the outbound side timing out, but the inbound side keeping on waiting.

See PR autonomys/rust-libp2p#2 for full details.

Close #3450.

This code has been tested on multiple operator nodes, and it prevents the substream limit exceeded errors.

Note about patching

You usually can't patch one crate version with another version, you have to change the crate fork/source as well:
https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html#the-patch-section

The patch only works because cargo considers https://github.com/subspace/rust-libp2p and https://github.com/autonomys/rust-libp2p to be different repositories. Because our polkadot-sdk fork depends on the autonomys URL, we can patch using the subspace URL.

Code contributor checklist:

@teor2345 teor2345 added bug Something isn't working networking Subspace networking (DSN) performance Related to performance measurement or improvement labels Apr 23, 2025
@teor2345 teor2345 requested a review from vedhavyas April 23, 2025 04:49
@teor2345 teor2345 self-assigned this Apr 23, 2025
@teor2345 teor2345 force-pushed the kad-in-subst-timeout branch from 3f7942c to 26d98fa Compare April 25, 2025 02:53
@teor2345 teor2345 marked this pull request as ready for review April 28, 2025 23:00
@teor2345 teor2345 force-pushed the kad-in-subst-timeout branch from 26d98fa to 7e3d618 Compare April 28, 2025 23:00
@teor2345 teor2345 requested a review from nazar-pc as a code owner April 28, 2025 23:00
@teor2345 teor2345 enabled auto-merge April 28, 2025 23:00
Comment on lines +413 to +422
[patch."https://github.com/autonomys/rust-libp2p.git"]
# Patch away `libp2p` in our dependency tree with the git version.
# This brings the fixes in our `libp2p` fork into substrate's dependencies.
#
# This is a hack: patches to the same repository are rejected by `cargo`. But it considers
# "subspace/rust-libp2p" and "autonomys/rust-libp2p" to be different repositories, even though
# they're redirected to the same place by GitHub, so it allows this patch.
libp2p = { git = "https://github.com/subspace/rust-libp2p", rev = "399c4c7fcba821a7bac2663e090a7b155b08ebe1" }
libp2p-identity = { git = "https://github.com/subspace/rust-libp2p", rev = "399c4c7fcba821a7bac2663e090a7b155b08ebe1" }
multistream-select = { git = "https://github.com/subspace/rust-libp2p", rev = "399c4c7fcba821a7bac2663e090a7b155b08ebe1" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really awkward. I understand why it works, but is it really not possible to patch autonomys with a different autonomys revision? Or wouldn't it converge on the same revision automatically on its own?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, patches can only come from different sources, because [patch] is a source-to-source replacement, filtered by crate name:
https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html#the-patch-section

It's not super clear in the documentation, but cargo returns the error:

patch for libp2p in https://github.com/autonomys/subspace points to the same source, but patches must point to different sources

https://github.com/rust-lang/cargo/blob/ab15d58608e1d6895fae4ab9a400efe49a75a13f/src/cargo/core/registry.rs#L441

And it does canonicalisation, which means we can't just use subspace.git and subspace.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also can't convince it to converge on the same revision, because that only works for crates.io and similar registries. Git versions are fixed to the specific commit hash.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very unfortunate 😕

Comment on lines +413 to +422
[patch."https://github.com/autonomys/rust-libp2p.git"]
# Patch away `libp2p` in our dependency tree with the git version.
# This brings the fixes in our `libp2p` fork into substrate's dependencies.
#
# This is a hack: patches to the same repository are rejected by `cargo`. But it considers
# "subspace/rust-libp2p" and "autonomys/rust-libp2p" to be different repositories, even though
# they're redirected to the same place by GitHub, so it allows this patch.
libp2p = { git = "https://github.com/subspace/rust-libp2p", rev = "399c4c7fcba821a7bac2663e090a7b155b08ebe1" }
libp2p-identity = { git = "https://github.com/subspace/rust-libp2p", rev = "399c4c7fcba821a7bac2663e090a7b155b08ebe1" }
multistream-select = { git = "https://github.com/subspace/rust-libp2p", rev = "399c4c7fcba821a7bac2663e090a7b155b08ebe1" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very unfortunate 😕

@teor2345 teor2345 added this pull request to the merge queue Apr 29, 2025
Merged via the queue into main with commit ed7c752 Apr 29, 2025
12 checks passed
@teor2345 teor2345 deleted the kad-in-subst-timeout branch April 29, 2025 01:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working networking Subspace networking (DSN) performance Related to performance measurement or improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

libp2p_kad::handler: New inbound substream to peer exceeds inbound substream limit
2 participants