On-chain verification for Verifiable Off-chain Data Aggregation #562
Replies: 3 comments
-
Thanks for raising this @Kubuxu, here are my current thoughts on the three options (FEVM verification, built-in actor, syscall acceleration). I can easily support doing nothing as the most conservative approach, and waiting for more data from real-world development and usage to motivate protocol-level change. However, I could also be reasonably convinced that the benefits of a syscall accelerating validation of Merkle inclusion proofs are worthwhile in the long term. A syscall for something non-primitive like this, which can be implemented in terms of other syscalls, is a fine line to walk. But I would prefer that option to adding a built-in actor.
-
I would +1 starting with FEVM verification and exploring the more "accelerated" options only if we see significant utilization and UX pain from FEVM gas costs. With M2.2 targeted for midyear, FEVM verification could easily be a sufficient placeholder for those first 6 months, to then be replaced by a more optimized wasm actor in M2.2. IIUC, optimization via syscall would also benefit from seeing real-world use cases, to assess how they align with and benefit from the proposed Merkle proof constraints.
-
Re: acceleration roadmap. A built-in actor followed by syscall acceleration is a bit circuitous IMO. I would implement a generic syscall that is capable of efficiently verifying any Merkle inclusion proof in a single go (the user provides the full input data laid out linearly, with metadata demarcating the boundaries of each level to digest, plus a multihash function, and the syscall streams over the data). It can then be used for PoDSI, or any other Merkle inclusion proof construction. To access it from Filecoin EVM smart contracts, I would just provide a Filecoin precompile, just like we do for other Filecoin-specific syscalls.
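The streaming scheme described above can be sketched as follows. This is an illustrative Python model, not FVM code: `verify_streamed` and the per-level record layout are hypothetical, and the multihash parameter is fixed to SHA-256 for brevity. The key idea is that each level contributes the bytes surrounding the running digest, so sibling placement and any framing bytes are all just data the caller lays out linearly.

```python
import hashlib

def verify_streamed(leaf: bytes, levels: list[tuple[bytes, bytes]], root: bytes) -> bool:
    """Model of a generic Merkle-inclusion syscall: the caller supplies the
    proof as a flat sequence of per-level records. Here each record holds the
    bytes placed (before, after) the running digest at that level, which
    encodes sibling position, prefixes, padding, etc."""
    node = leaf
    for before, after in levels:
        # Stream one level: digest = H(before || running-digest || after)
        node = hashlib.sha256(before + node + after).digest()
    return node == root
```

In a real syscall the `levels` metadata would be offsets into a single linear buffer rather than Python tuples, but the fold over the data is the same.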
-
Motivation
The proposal in #512 (soon to be an FRC) lays out a pattern allowing aggregators to prove to clients that a data segment was aggregated within a larger data deal. While useful in a standalone setting, it doesn't cover one crucial capability: proving this fact to client contracts or third-party contracts.
For this to be achieved, a method for on-chain verification of Proof of Data Segment Inclusion (PoDSI) is required.
Requirements
Discussion
There are several ways to approach this problem and achieve different tradeoffs.
FEVM verification contract
By utilising user-deployable contracts, a PoDSI verification contract can be deployed. It would use the SHA256 precompile to verify the two inclusion proofs (30-60 SHA256 calls). Additionally, it would call into native actors to verify the deal's status, size and commitment.
This solution has the lowest long-term system overhead: no new built-in contract code is added, and no system adaptations are necessary. Its drawback is the high expected gas cost of execution, due to the overhead of the FEVM, the SHA256 precompile and syscalls.
It also requires that the PoDSI verification routine be implemented in Solidity, and that implementation would be used only until the deployment of native FVM contracts is enabled.
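For context, each inclusion proof the contract checks is a walk up a binary Merkle tree, one SHA256 precompile call per level, which is where the 30-60 hash count comes from. A minimal Python sketch of that loop (illustrative only; `verify_inclusion` and its parameters are our names, not the FRC's):

```python
import hashlib

def verify_inclusion(leaf: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """Walk a binary Merkle path from leaf to root: one SHA-256 per level.
    `index` is the leaf position; its low bit at each level tells us whether
    the running node is the left or right child."""
    node = leaf
    for sibling in proof:
        if index & 1:  # current node is a right child
            node = hashlib.sha256(sibling + node).digest()
        else:          # current node is a left child
            node = hashlib.sha256(node + sibling).digest()
        index >>= 1
    return node == root
```

A Solidity port of this loop over the SHA256 precompile pays FEVM interpretation and precompile-call overhead at every level, which is the gas cost the discussion above is weighing.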
Built-in verification actor
Built-in actors run on the native WASM-based FVM, which is not currently user-deployable and won't be for nv18 and nv19.
A verification contract for PoDSI could be deployed either as a native actor or as a utility method on an existing contract (maybe DealMarket).
This would significantly reduce verification overhead, and it would continue to be useful once native FVM contracts become user-deployable. While ideally this functionality wouldn't live within the built-in actors, that is currently the only way to deploy native FVM code.
This approach has higher maintenance overhead, because the code should, at some point, be moved out of the built-in actors. On the other hand, it avoids requiring verification to be implemented in Solidity and later discarded.
Either of the two with syscall acceleration for Merkle Inclusion Proofs
Merkle Inclusion Proofs are currently one of the few universal proofs used across numerous protocols within and outside of the Filecoin ecosystem. While technologies beyond them (for example, algebraic Vector Commitments) are appearing on the horizon, they are still in the early stages of development or deployment. Merkle Inclusion Proofs will remain a valuable utility for years to come, due to all the existing systems using them.
The format and verification of end-to-end Merkle proofs (from the Merkle root down to the piece of data of interest) is a tough subject, and one other groups have attempted unsuccessfully in the past (ICS23). The majority of the complexity is due to the unique structure of the Merkle tree in different use cases and the additional constraints that have to be expressed.
To avoid these issues, we propose not an "Inclusion Proof interchange format" but a syscall accelerating the verification of Merkle proofs. This narrows the problem down to chained hashing with layer-dependent prefixes and suffixes. Verification of other constraints, like path, key, alignment and non-existence proofs, would be left to the contract code.
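Concretely, "chained hashing with layer-dependent prefixes and suffixes" is a single fold over the proof: at each level the syscall hashes a per-level prefix, the two children in order, and a per-level suffix. A hedged Python model of those semantics (`verify_chained` and its parameterization are hypothetical, and SHA-256 stands in for whatever hash function the syscall is configured with):

```python
import hashlib

def verify_chained(leaf: bytes, index: int, siblings: list[bytes],
                   root: bytes, prefixes: list[bytes], suffixes: list[bytes]) -> bool:
    """Chained hashing with layer-dependent prefixes/suffixes: at level i,
    parent = H(prefixes[i] || left || right || suffixes[i]).
    `index` selects whether the running node is the left or right child."""
    node = leaf
    for sib, pre, suf in zip(siblings, prefixes, suffixes):
        left, right = (sib, node) if index & 1 else (node, sib)
        node = hashlib.sha256(pre + left + right + suf).digest()
        index >>= 1
    return node == root
```

Path, key, alignment and non-existence checks stay in the calling contract, which only needs the syscall's boolean result plus its own bookkeeping over `index` and the claimed leaf.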
A syscall accelerating Merkle Inclusion Proof verification would significantly reduce the overhead of verifying PoDSI, thanks to a reduction in executed syscalls (2 verification calls instead of 30-60 hash calls).
cc @anorth @raulk @Stebalien for feedback