@ecioppettini
Contributor

Adds an example of a toy VM that has a sort opcode. The sort is optimized so that it runs out of trace and only its result is verified in the trace, but only when running on starstream.

For really small vectors the lookup argument is actually worse, since the verification is more expensive than doing the sort itself. But for vectors of 500 elements it's already a significant speedup (almost half the trace size).

NOTE: It's not actually possible to run the nebula side of this right now, because of these issues:

ICME-Lab/zkEngine_dev#47
ICME-Lab/zkEngine_dev#48

That said, I could test it by patching those locally, though not necessarily with the proper patch.

NOTES for reviewing:

  1. The VM is a stack machine with no registers. Its structure is similar to Midnight's on-chain VM, but much simpler. The part most relevant to the starstream side is the Op::Sort implementation.
  2. The allocator is passed explicitly as a parameter everywhere. This adds quite a bit of boilerplate, since it kept me from using derive in most places. But I think that would be an ideal scenario for integrating a VM.
  3. I didn't add an encoding for the opcodes, because there is no way of passing a vector to a coordination script anyway. It could be a next step. For now, the program is embedded in the UTXO.
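
To make notes 1 and 2 concrete for reviewers, here is a minimal sketch of a stack machine with a sort opcode and an explicitly threaded allocator. It is not the code in this PR: the `Alloc` trait, `CountingAlloc`, the `Push` and `Add` opcodes, and the shape of `Vm` are all hypothetical, and the sort is done directly rather than out of trace.

```rust
/// Hypothetical opcode set. Only `Sort` corresponds to an opcode named in the PR.
#[derive(Debug, Clone)]
enum Op {
    Push(u32),
    Add,
    /// Sorts the top `n` stack entries in place (ascending).
    Sort(usize),
}

/// Assumed stand-in for the allocator that is passed explicitly everywhere.
trait Alloc {
    fn alloc_vec(&mut self, capacity: usize) -> Vec<u32>;
}

/// Toy allocator that only counts how many buffers were requested.
struct CountingAlloc {
    allocations: usize,
}

impl Alloc for CountingAlloc {
    fn alloc_vec(&mut self, capacity: usize) -> Vec<u32> {
        self.allocations += 1;
        Vec::with_capacity(capacity)
    }
}

/// Stack machine state: a value stack and nothing else (no registers).
struct Vm {
    stack: Vec<u32>,
}

impl Vm {
    fn new<A: Alloc>(alloc: &mut A) -> Self {
        Vm { stack: alloc.alloc_vec(16) }
    }

    /// Executes one opcode. Every allocation goes through `alloc`.
    fn step<A: Alloc>(&mut self, op: &Op, alloc: &mut A) -> Result<(), String> {
        match op {
            Op::Push(v) => self.stack.push(*v),
            Op::Add => {
                let b = self.stack.pop().ok_or("stack underflow")?;
                let a = self.stack.pop().ok_or("stack underflow")?;
                self.stack.push(a.wrapping_add(b));
            }
            Op::Sort(n) => {
                if *n > self.stack.len() {
                    return Err("stack underflow".to_string());
                }
                let start = self.stack.len() - n;
                // Copy the operand into a freshly allocated buffer, sort it,
                // then write the result back over the same stack region.
                let mut buf = alloc.alloc_vec(*n);
                buf.extend_from_slice(&self.stack[start..]);
                buf.sort_unstable();
                self.stack.truncate(start);
                self.stack.extend_from_slice(&buf);
            }
        }
        Ok(())
    }

    /// Runs a whole program and returns the final stack.
    fn run<A: Alloc>(program: &[Op], alloc: &mut A) -> Result<Vec<u32>, String> {
        let mut vm = Vm::new(alloc);
        for op in program {
            vm.step(op, alloc)?;
        }
        Ok(vm.stack)
    }
}
```

Because the program is a plain slice of `Op` values, it can be embedded as a constant without any opcode encoding, which matches note 3.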

TODO/QUESTION:

The unconstrained function currently doesn't check memory bounds.

Right now, the Rust code checks that the resulting array is a sorted permutation of the original. The problem is that it's not really viable to check that the function didn't also modify other sections of memory. This is obviously unsound, so there has to be a way of constraining the function to a certain memory range.
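
For reference, the kind of check described above can be sketched in plain Rust as follows. This is an illustration under assumptions, not the PR's code: the function name and the `u32` element type are hypothetical, and the in-trace version would use the lookup argument rather than a hash map.

```rust
use std::collections::HashMap;

/// Returns true if `candidate` is sorted ascending and contains exactly the
/// same multiset of elements as `original`.
fn is_sorted_permutation(original: &[u32], candidate: &[u32]) -> bool {
    if original.len() != candidate.len() {
        return false;
    }
    // Sortedness: every adjacent pair must be in non-decreasing order.
    if candidate.windows(2).any(|pair| pair[0] > pair[1]) {
        return false;
    }
    // Permutation: count each element of `original`, then consume the counts
    // with `candidate`. Equal lengths mean no count can be left over.
    let mut counts: HashMap<u32, usize> = HashMap::new();
    for &x in original {
        *counts.entry(x).or_insert(0) += 1;
    }
    for &x in candidate {
        match counts.get_mut(&x) {
            Some(c) if *c > 0 => *c -= 1,
            _ => return false,
        }
    }
    true
}
```

Note that this only inspects the two slices it is given. It says nothing about writes outside the output range, which is exactly the gap raised here.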

I think there are generally two options:

  1. Instead of calling the function on the same instance, we create a new instance. This seems to work, but it may also lead to weird bugs, for example if the function needs to access any statics or allocates something. But as long as it's clear that it runs in a sandboxed memory space, it may be fine. The other issue is that the API may get more complicated. For example, if you need to pass a type that contains pointers, you may need to serialize it.
  2. Run the unconstrained function with instruction tracing, then check the trace and assert over the range of memory it writes to. The problem with this is that I don't really know how to prove that the check is done correctly. It may also be hard to do for types that contain pointers, since figuring out the allowed range is then difficult.
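
A rough sketch of what each option amounts to, modeled on a flat `u32` memory. Everything here is hypothetical: `MemWrite` is an assumed trace record, `run_sandboxed` imitates a fresh instance by copying the operand into a scratch buffer, and neither function addresses how the check itself would be proven.

```rust
use std::ops::Range;

/// Hypothetical trace record: one memory write made by the unconstrained function.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MemWrite {
    addr: usize,
    len: usize,
}

/// Option 2: scan the write trace and return the first write that is not
/// fully contained in `allowed`, if there is one.
fn first_out_of_range(trace: &[MemWrite], allowed: &Range<usize>) -> Option<MemWrite> {
    trace.iter().copied().find(|w| match w.addr.checked_add(w.len) {
        Some(end) => w.addr < allowed.start || end > allowed.end,
        // An address range that overflows can never be inside `allowed`.
        None => true,
    })
}

/// Option 1: run `f` on a private copy of the operand, so it cannot touch the
/// rest of `memory`, then copy the result back into the original range.
fn run_sandboxed<F>(memory: &mut [u32], range: Range<usize>, f: F) -> Result<(), String>
where
    F: FnOnce(&mut [u32]),
{
    let target = memory.get_mut(range).ok_or("range out of bounds")?;
    // The copy stands in for serializing the argument into a new instance.
    let mut scratch = target.to_vec();
    f(&mut scratch);
    target.copy_from_slice(&scratch);
    Ok(())
}
```

The copy in `run_sandboxed` is where the serialization cost of option 1 shows up, and the `allowed` parameter of `first_out_of_range` is the value that becomes hard to compute for types with interior pointers.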

@ecioppettini ecioppettini requested a review from SpaceManiac June 9, 2025 15:32
@ecioppettini ecioppettini self-assigned this Jun 9, 2025
@ecioppettini ecioppettini force-pushed the toy-vm-with-lookup-argument branch from e91a78c to 407250c Compare June 9, 2025 15:37
@SpaceManiac SpaceManiac changed the title add toy_vm with sort lookup argument example Add toy_vm with sort lookup argument example Jun 27, 2025