feat: implement state_getReadProof#2136
Conversation
|
Unfortunately, this is a tricky one to implement due to the fact that the function parameters accepts more than one key.
Yes, the behavior you need is very close to what the
At the moment in your PR everything is done in the same step. However you've forgotten a critical operation: verifying that the proof is valid. In order to verify this, you need the value of |
lib/src/trie/proof_decode.rs
Outdated
| pub fn decode_to_raw_entries(proof: &[u8]) -> Result<Vec<&[u8]>, Error> { | ||
| // TODO: don't use a Vec? | ||
| let (_, decoded_proof) = nom::Parser::parse( | ||
| &mut nom::combinator::all_consuming(nom::combinator::flat_map( | ||
| crate::util::nom_scale_compact_usize, | ||
| |num_elems| nom::multi::many_m_n(num_elems, num_elems, crate::util::nom_bytes_decode), | ||
| )), | ||
| proof.as_ref(), | ||
| ) | ||
| .map_err(|_: nom::Err<nom::error::Error<&[u8]>>| Error::InvalidFormat)?; | ||
|
|
||
| Ok(decoded_proof) | ||
| } |
There was a problem hiding this comment.
I moved this snippet out from decode_and_verify_proof, as I needed just the entries SCALE-decoded.
There was a problem hiding this comment.
I would prefer to revert this change. SCALE-decoding a proof takes in the order of nanoseconds. Using a Vec<u8> and decoding it on the fly is most probably faster than using a Vec<Vec<u8>>, and even if it wasn't I much prefer the consistency of having a proof always be a Vec<u8> no matter the context.
| for entry in proof { | ||
| in_progress_results.insert(entry); | ||
| } |
There was a problem hiding this comment.
Using a HashSet to remove duplicates, which is all I need to get the common proof. Maybe I should use a HashMap<{entry hash}, {entry value}>?
There was a problem hiding this comment.
If you switch proofs to being a Vec<u8> as per my other review comments, you should collect them in a Vec<Vec<u8>> (a vec of proofs), then decode and remove duplicates right before returning the final result.
Merging proofs should probably be its own function in the smoldot-lib/trie module.
b2b09a8 to
0e3d1de
Compare
|
I think it's mostly ready for review @tomaka I have another question: I'm relying on |
light-base/src/sync_service.rs
Outdated
| /// Corresponds to a [`StorageRequestItemTy::Hash`]. | ||
| MerkleProof { | ||
| /// Merkle proof of the storage value of the key. | ||
| proof: Vec<Vec<u8>>, |
There was a problem hiding this comment.
| proof: Vec<Vec<u8>>, | |
| proof: Vec<u8>, |
lib/src/trie/proof_decode.rs
Outdated
| pub fn decode_to_raw_entries(proof: &[u8]) -> Result<Vec<&[u8]>, Error> { | ||
| // TODO: don't use a Vec? | ||
| let (_, decoded_proof) = nom::Parser::parse( | ||
| &mut nom::combinator::all_consuming(nom::combinator::flat_map( | ||
| crate::util::nom_scale_compact_usize, | ||
| |num_elems| nom::multi::many_m_n(num_elems, num_elems, crate::util::nom_bytes_decode), | ||
| )), | ||
| proof.as_ref(), | ||
| ) | ||
| .map_err(|_: nom::Err<nom::error::Error<&[u8]>>| Error::InvalidFormat)?; | ||
|
|
||
| Ok(decoded_proof) | ||
| } |
There was a problem hiding this comment.
I would prefer to revert this change. SCALE-decoding a proof takes in the order of nanoseconds. Using a Vec<u8> and decoding it on the fly is most probably faster than using a Vec<Vec<u8>>, and even if it wasn't I much prefer the consistency of having a proof always be a Vec<u8> no matter the context.
light-base/src/sync_service.rs
Outdated
| } | ||
| } | ||
| RequestImpl::MerkleProof { .. } => { | ||
| // The proof was decoded successfully, it can't be invalid now |
There was a problem hiding this comment.
decode_to_raw_entries isn't enough to check if a proof is valid, you must use decode_and_verify_proof. Furthermore, you must check that the proof indeed contains the desired key by making sure that DecodedTrieProof::storage_value returns Some.
There was a problem hiding this comment.
Another issue is that unfortunately the proof might contain extra unnecessary entries. For example, a malicious node could send back the content of the runtime in every single proof.
In the case of a storage value query or similar, it's not a problem, as the proof is already in memory, and then we just extract the value we need and throw away the rest immediately. But if we return the proof (and collect proofs in the JSON-RPC service and keep them in memory and send them back to the JSON-RPC client), then it might be problematic.
To solve that, you shouldn't just return the proof that was sent through the networking after having verified it. Instead, you should use a proof_encode::ProofBuilder, then call set_node_value with the requested key, and taking the value from the returned proof. Then, repeatedly call missing_node_values and set_node_value, extracting the node value from the proof and injecting it into the proof builder.
This behavior should be in a minimize_proof function somewhere in the smoldot-lib/trie module.
If this is too much you can also leave this as a TODO and let me fix it later.
There was a problem hiding this comment.
Thank you very much for all your guidance here!
I have pushed a commit with these changes, and it works for all the cases when the leaf node has a storage value. But when the key points to an entry that's empty, the proof is missing the leaf node.
This happens because I'm now calling closest_ancestor_in_proof for the leaf key, because otherwise ProofBuilder::set_node_value panics with a "mismatch between node value partial key and provided key".... This solves the panic because it goes one level up, but then for these cases the resulting proof is missing that leaf node.
I tried investigating why is that happening, but I'm left completely confused as it seems that the indeed the leaf node has a partial key that doesn't match at all with the key that I requested. Maybe I'm missunderstanding something about the format of these proofs?
As an example, querying the proof for 0x26aa394eea5630e07c48ae0c9558cef7b99d880ec681799c0cf30e8886371da94be3ddeeb4578c72d73c903a12c9d3c0ac7a7e96dd3562f7b16dbfc0db669d2facd2489ef5116ce25d5dbd4d4141214c (System.Account of an address that doesn't exist) an RPC sends back the following response:
{
at: 0xe29dbbbbdfc1a5ee976ca1055c2b5b4bf8118257850f5d049ae4dc0387881851
proof: [
0x3f3c059315a45bafc27bca0efd388ac6466fd75d8e19c206df9d1c1e1f4eee076ad8874362e44e90c2dd9380a96879239122f8e781c4f9128dc4237462321575009b970e8afb898c1a7cb69fb2bbfbb9
0x80c1c080c86f8124495f4d3dfc488db4ea766cacff9a90cacc92b6405b352dadbe12276d80093253e33158323d9cfc2f7e915daf94477a4a1bc55f8bff33fd3b53a1afd70280257442a3193ebf64fb8a0019b2d9fc1d97734011eb5a8d2ea414f5ff9a27952b80ff505eb6370860111fc7f902fc2ac74bf8e3fc923ecd11898781541e60239d1e804091f08b40f24b052e0dc92942ef804654ae762e958a331876b0a19ba29fd18f
0x80fafb8064d435721939cc0a5a87be1db95ef268fd36b59f0b0b3070fa9cc55536b1884a80eb137076a56bda8ec6a4c5ccd2d49ce18fde013bfde92c0edb08eb3e228ff0d380355086a014ab17ec66d38b241084e019659323e0e6bc7f9447f46d335eb2f4ac80065fd326e8dfe5ebf2d0af00e772f9f4f9f9caf9bf1ef17d5c78c180610aa95b805d563104a97a6d7e389ae65c28410bcad945a54893be2c87786f6e6f626529cb8041f9903b72e61d7381776966885a2b4e8de7ab468eca732c46bf864e23d21f8d8041c72a96b18548cc01d5e15ac7dff8659c720269530287845263983e88c20e21802c529ad41f7f06da77f583a055be3559c0a06f2f1fdb3fc206f63c73459c90b480fc9db008ed3087d65673190e03dbbc6bd74978f26c65656ea0233f4ed41e5d8280de52449e08fd672ea828d86c0ee5127dfad6dea124994a80149d0da02bbaa821801e3f018b1d3d1da22b61e520d2ac6e5d15e848b9c3b41fdffdd12871682a419c8019115673cf29c4372338eabfb2339e86baeccf6c7ff3b5535fa18ce65639a1a88059ca75bd25912a17cd5451ba8153a42abe2a0a428ff978c62a2dc5151ec45da7
0x80ffff800c56ad4263d3a2922f4bd4b9b1922fe92a61dece77a028f7b29376e6ed4f80c6800393b0db4a4cb7eb2d9a3e7b66afc06ed57bdd5b9aefafefce578008c4cad64c8033b8501198ac2acd9e0049ef314bd623594a7b98d520a4fef94e3f98386a0b8680670604e3a9d86fc90440b506b41429f222c8cce8040afb0ae6c5746cfd4055b1804decf0d894bb3eb42f89eb15cb7084399c6582bffe5f9dd6da7bc9585d2becb880dafe8a43d7a75b6786e159e5a14508c6539a43df3e36bb4b900085b894013ef180bd6d5b066b2de672cf5caa396402e31228526e502b72f729df6672cbdb83433580d256d66fc035aafb5df3f7f232017a1377699718094653e3f9e2dd355e137b8b80953f40e108b56fe44a58b2e9b7175a62fea9302be4aeb003a4d0f372c3869cfb80e419d0e39e8da48346c55c2fdc4568bcb46c67c787fece46b613281d10c7bb16802fa982b48214d4eccc1355adfa877d1e6a72d361d4bb4b4fa19e315224c71cce80fb41fc12a7ba0081897e7fad39f2e7e6c75c704bbe696c0f8c20bb8c9fbec3a180e7fd80693d898e3da16cc5507a7fb011bfab827299897a5e1fd40853acee913f8028edb08799bbb7e4e64812aa159e636f6c87902f4a2b8ada62c9f603893d92f18071895ccffaaeeaa69e7ffa015c4be4f694e83930ed52a299e1081dfc8504ae3c801f21e5e947ec7ce2431b3dba1f0ec74fdef6895d7ea78bbe9b3e7c607f6113ec
0x80ffff800dfb728407db2bcf0746ca2444a6a62488a6b9d5bce06585e160b0e453ee59e98006b215a17d620cddd021f73da548e4e68dbf273af6056afee451990f4fb019d580bc9473107278d193c5be69025f1ed37a19fd641b0783d920dd0d4e16cd0d15c580d4cbd810d03540fb3d515128949a1900d80d0fae3e070ae8990a2063aefb978c80e481fa2708fc510d8a4b991f00f3f95dae08ea8e6940cee96d80c17ca86df133803a550127a7db55b9c3f03f9b82d950d26b672b6c2759311702f7e8a7661d83d080ff4c53e7539d519264d0ea3bb157729077c9674bc35ef4b46dd249c1fe5b0be08057fe19d26166552c68b7151aad61ff95feb1681dc4a5334e750c508ddc0e30638070dce9cd7372c96fbcb876fc368f14802221381e2014f778f5b2528f42a23d78805f2d0c2d4288e15b8dbb2247c01f301bb8573963c04dea764b39be605a01c4fb808cbfbf4488780d2959d063ad1fc7bdaff9c5728638c139eb46788cafa479edab80456e8c9417e2c9e58ee97046aa8f0e2ba5a8e3027107029ad19fc8b2a4374c5280c9a9f232fee525433987e90c9e5782ea2162d4ee44a0f3a83bb209cb333d69218061873301735cee551878e324fe01edf6e22f13629912fc0e2ff73fd8d5410c04805170d73e4a4f2496ea60c2f187129209fc65f95acb75213da979037b5ce158738088c500fe12f7f93626d98cf1c94d1cf3d5881d777f50f016fd6916e4a9ca6ff0
0x80ffff802ecd939856a0c587db5bd4c2496f623a60b14dc1084ec17d27c2ad35143258f18048d7bbd67a61cb37dc851822f4ecb8680b7a5dad5ad4c9a501ccdf8fb140ccf4800dcae9b2ba7c848db622a6a582a27226cd9e405df6a386dfdadb98d7ca54c6178060653fd6ef64581db4f8a38e21f2374a9c5bb13529ebc4050db89c90549511b88022441aec78eb607e4e2206af6b5708c95731bd5d9e4a804e0a9f1a556b447349804b652d0dea9b236984611184a1ad3b14c99f0d1b3c0496228b3e6ccb8688bf16807935251513de0a2260c9257702503fc7bf02f10c67a72458245a8c71bd4bc7908062950b7091033caf39297b648597f39e1c48d988dc357fae1f43df94a2f0d48880158676520c2a28ee31231d543682c382834dbc43eb1fb2b88a81ccb32990519e80cb6582990dc72448439042236dc61640f89352f92bf13d9ffc3ff91312b7107780f26304ac63de31e0d8e81d5bbfd46cefa401415311a887d7ebbd0ba083706b568053c1553e226686bcb6736cd61cf13a7b160cd1e0a9251115fb0649f124619378806f7ab788ed2524a00ee33efea042b9b70383c4a94ed26ac1e7522bf133691e91801c58ea525bdd12a42436d9f997c38f2ae37ee1fb619556228537e68b6a6395d38008521bcc8e6790ec0622ee479c3cbe698564318fea97e6fe779aec104dc6f60b800abad974b7392fe05aab4d763596e4028d2d4ec267a4950e9324c9be2ef26c12
0x80ffff805343e1aac01de90badaa2e96949c89d743343428bd1f7db46c0bfdcc35d0dd3e8021269d4b2ebbbef1089214dcffd0b4986542a678654be92638e13e8f485df6ae8062d5b3f2a442774edfc2ec31580a55e9b7d68bf45ff3bbd6e6bbfb8fe5a7c2f380ae85965de169605c26c795625b8fe189ac38d8374e469b155a11801fb54fd8378054fa3ea6fab8cac7e452ee2abd15260d4b5c1e1b329944cf6a0f53cbded2af8680566ca31796adf71244b6ff6f1b9e97c6c75d685827e85c00f3a27ceefc65b69480b059d7f240bac4d14ef670dac2cc1e2cbafb624eb66bda90a2e5b3288bd23fde806a714f3d90cf07e3e6f9187129fd37ef6d2adc4f5e39d0cb93b8cb4cde77f953802bcaed307595f9f4c2478dbdc8bb5a474418da91754b2e7e13787fa5b40c4886801e23731ce5f46d3a8884240c910b0e3f3a684dcbbd20e789bc15acf202c4f7858048a47cb0124af25d7786013d21752524dacbd831c6b95f74ced1373e899b710880bbe5ca715f6f722f70821fcb562132656b39a4faa56d47cbf914493db7131fdd80cd922306f71c2781a3755d000fdb77ce3e47f402fd7542ece0a0b754337f1e5180ca2c48ca22c45789bd5e4f078ec8c9d46fcbda7dd925bfe350f32de21e2becfb80972ccb64ab92eebfb003bbf8305ee15b64c527e6696a4586c3c955ba47fb66fb80d3a46e3c26c9cf4ca0fc5f06f90112a0d3fb1137c0df1c4d32dee2a54702eb52
0x9eaa394eea5630e07c48ae0c9558cef7399f80fb153cfbbde0e71c214f728eb33583cc321586dee2674f6818bf4690d6e4b48480fc121f4fe15b39f71a2f7dc2357e4b278be739e2aadd1eac1c7a36f08fa195e3505f0e7b9012096b41c4eb3aaf947f6ea4290800004c5f0684a022a34dd8bfa2baaf44f172b710040180e1ca0305a045c9232e05ba5bf3b6a317611987a3bc86ba96eddff7f7f46c889080349746acb0021b078bbf11c01956cb4c3711606e437903c3f848b81ca4f86a2480da6dcdc48a4caab18de9903868182d51f03c9956e77627cde58e88809a1b206d805e2c6d0404bf659ca4f011a9eed5885c01d233ca83c4f726bf14cb3586883f704c5f021aab032aaa6e946ca50ad39ab6660304017c5f09cce9c888469bb1a0dceaa129672ef83426573d0020706f6c6b61646f74
0x9f099d880ec681799c0cf30e8886371da9ffff80d9094dd897cdf419f5830fbb299f47bde973d34a582d366d83e1383de8e309498086735aebdd57b04750dd312ee35068561331f92ea307ee93c71e61e58d0e777e807f705a76a6bafad1d4b0c444a1c0dfba42c6e8c13f0314542c0339b62233c92c80ca3b5f39a6a4577e16d8e8db2e501a9b849d81612120cfb2c13bcbb0311b46238042b284740d1d91a7fb8c14906eb844a106dd1e94d8039a3cd98a2bdd78235ccb80073a6190b630863ec5bb6ff36843d38c7044a5559182bd41a889ecc0927f2c2180683d21a29b3492224781853de4b2d582bc2cb8da9645214e04a7fd38b850eea580c5d9a111b404efa442b3d62a6849bcc36f444138e7161caf0f2efec82df42932806b82ab40bd867cb9fc98e6878a808ff05a14695641e47e1acc3ce5244638887b802adc415fea896b77db7a0beddaa8cc1d69b0544f64d8c71a64530096e4a26a8880bfdb069f3504f91d50fa5a4bc748a89bbe012016d2dfbcb5228eab4f9015cb768099caafa79f4524672128a0949e3c9e9ba1dbb5c8e6c95df895d74484cc8444de8033891e145981f47618c1fbb43ec9fd83697b4332eb7ab855d32e879dce9cc449802ab5b63c936da2c11a87d41f5ec76e005621fd97b2cd42ab5a86c99dd1406c6480f806efd295e8900308f5cfa5efe10a239af428b11f761402232c09d7854538c680e5d44063a63163bb90a228967fe54bdde0e3f617e41236eacb6d5a5ba37ce412
]
}
0x3f3c059315a45bafc27bca0efd388ac6466fd75d8e19c206df9d1c1e1f4eee076ad8874362e44e90c2dd9380a96879239122f8e781c4f9128dc4237462321575009b970e8afb898c1a7cb69fb2bbfbb9 is the leaf node. The header 0x3f3c specifies it's a "Leaf containing a hashed subvalue", and the partial key length is 91 nibbles... but those nibbles don't match at all with the key that I requested in the first place (so ProofBuilder::set_node_value panics)
How should I approach this?
| for entry in proof { | ||
| in_progress_results.insert(entry); | ||
| } |
There was a problem hiding this comment.
If you switch proofs to being a Vec<u8> as per my other review comments, you should collect them in a Vec<Vec<u8>> (a vec of proofs), then decode and remove duplicates right before returning the final result.
Merging proofs should probably be its own function in the smoldot-lib/trie module.
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
|
Would you like me to finish this PR? |
@voliva just left on vacation today and he won't be back until Aug 24. I could be wrong, but my understanding is that he already finished this PR ~2 months ago and that he was waiting for you to provide the final review. That being said, if you want to improve/change things, then please do so! Although, it would be amazing if you could also explain what are the parts that the current implementation got wrong (or that need improving), so that we can all learn a bit 🙏 Thanks a lot! |
|
So this was basically left at this question / roadblock I found while implementing the solution of reconstructing the trie with the entries I get: #2136 (comment) I have worked around this by my own with a commit not in this PR, because I was awaiting your thoughts on that before pushing something that essentially is removing an assertion that was in place, as I'm probably missing context… but with that, it works, I can essentially just push empty leaf entries into the trie while reconstructing it. My original plan was to actually add a new fn to ProofBuilder to add specifically empty nodes, but while trying it out on the existing function I realised removing that assertion was enough. But again, that was temporary awaiting your thoughts. Not sure what's the best way to continue that though - I'm OOO for a while, so I can't really spend much time on it for the next weeks. Maybe @josepot can take over. |
tomaka
left a comment
There was a problem hiding this comment.
Ok, sorry for not reviewing that earlier. The GitHub notifications made me believe that only one line of code had changed since my previous review, which is why I didn't review after.
There's a lot of nit-picks. However one big issue is that the new node_value field of TrieNodeInfo is unfortunately completely wrong. I haven't looked too deeply yet into how to do that more properly.
|
I've pushed some commits and I believe that this PR should now be ready. It would be nice to maybe review my changes or test them, but we could also merge without that. Work time for me: 2h |
voliva
left a comment
There was a problem hiding this comment.
Hey @tomaka sorry I was away for a while and then got to work with the PBA.
I have tried this locally and it works correctly for all cases: leaf key and subtree read proofs as well as for any proof of entries with missing values 🚀
Thank you very much!
This implements the legacy method
state_getReadProof. In PAPI we want to add an API to retrieve the storage proofs, but currently smoldot doesn't support it.This might need a proper spec as part of the JSON-RPC interface spec, we're thinking on starting a discussion there for a potential chainHead_v2 with this and other improvements, but this PR would let us ship that feature with smoldot support without waiting for this to be spec-ed out.
So far I have tried the happy path and it seems to work properly. I need some guidance though, as smoldot code is quite new for me. The following points raised doubts on how should it be done so that it follows the same style as the rest of the code:
serde::Serialize.MultiStageRequest, but everything is done in the same step. Maybe it should be broken down in two steps? (one to fetch the proof, the other to send the result?)storage_queryinsync-service, so maybe that should be moved there?I'll appreciate any guidance on this matter 🙏