Skip to content

Conversation

@aakoshh
Copy link
Collaborator

@aakoshh aakoshh commented Mar 21, 2025

This fixes in inexplicable bug I found while working on AztecProtocol/aztec-packages#12841

AztecProtocol/aztec-packages#12841 (comment) demonstrated that a vector of enums can be printed in the convert extension method added to msgpack::object to handle std::vector, showing all items as maps, but during the iteration of the items they change from being MAP to NIL.

AztecProtocol/aztec-packages#12841 (comment) narrowed it down to being due to v.resize(o.via.array.size);, which I could only explain by v overwriting the memory of o.via.array itself, which makes no sense to me, but what else can I conclude. It's weird because there are other instances of std::vector in the ACIR model of the PR with structs, instead of maps.

For example this seems fine:

struct Program {
    std::vector<Acir::Circuit> functions;

but this is not:

struct Circuit {
    uint32_t current_witness_index;
    std::vector<Acir::Opcode> opcodes;

where Acir::Opcode is a single element map keyed by the name of the type, with the value being heterogenous.

The fix I found was to use push_back if possible.

I expected that maps would cause a similar problem, which I'll know when I switch WitnessWitness to use msgpack as well, but it didn't.

@fcarreiro
Copy link

In principle making any changes to this repo (which I didn't know existed) would make it difficult to move to newer versions/merge upstream changes, so this shouldn't be merged lightly. I'll review in a bit but are you sure this is a bug? Has it been reported or fixed upstream?

@aakoshh
Copy link
Collaborator Author

aakoshh commented Mar 21, 2025

@fcarreiro this is the repo used in https://github.com/AztecProtocol/aztec-packages/blob/master/barretenberg/cpp/cmake/msgpack.cmake ; I doubt that we would be moving to a different version in the upstream, given the changes Adam had to do to make this codebase work with Wasm (throwing errors had to be changed into macros doing abort based on compilation flags available in the Barretenberg build). We can still pull updates.

I am very surprised by this bug, I have no explanation to why it happens. That piece of code has not changed for the last 10 years. All I know is based on my experiments, that single resize was responsible for overwriting my data with default values. To me it looks like if this can happen, than all bets are off, so I wouldn't try to upstream it without further investigation by someone better versed in C++ than me.

@TomAFrench
Copy link
Member

I don't think I'm best placed to determine whether this is necessary. I'm tempted to say that it's unlikely that we're the first ones to hit this edgecase but I can't really fault the changes in aztec-packages.

@aakoshh
Copy link
Collaborator Author

aakoshh commented Mar 21, 2025

FWIW I added hello_world.json and witness.gz files which should contain msgpack encoded data to the PR in aztec-packages. With that testing bb should be simpler, without requiring to run nargo again, so instead of this:

export BACKEND=$(pwd)/barretenberg/cpp/build/bin/bb
export NOIR_SERIALIZATION_FORMAT=msgpack
noir/noir-repo/examples/prove_and_verify/test.sh

It should be just:

barretenberg/cpp/build/bin/bb prove -b hello_world.json -w witness.gz -o ./proofs

@TomAFrench
Copy link
Member

TomAFrench commented Mar 21, 2025

As a quick sanity check, I can decode the msgpack serialization of the noir template project to json with the below CLI command, so it's not that we're generating garbled data on the noir side.

$ jq -r '.bytecode' ./target/temp.json | base64 --decode | gunzip | tail -c +2 | msgpack-cli decode
{"functions":[{"assert_messages":[],"current_witness_index":2,"expression_width":{"Bounded":{"width":4}},"opcodes":[{"BrilligCall":{"id":0,"inputs":[{"Single":{"linear_combinations":[["0000000000000000000000000000000000000000000000000000000000000001",0],["30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000000",1]],"mul_terms":[],"q_c":"0000000000000000000000000000000000000000000000000000000000000000"}}],"outputs":[{"Simple":2}],"predicate":null}},{"AssertZero":{"linear_combinations":[],"mul_terms":[["0000000000000000000000000000000000000000000000000000000000000001",0,2],["30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000000",1,2]],"q_c":"30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000000"}}],"private_parameters":[0],"public_parameters":[1],"return_values":[]}],"unconstrained_functions":[{"bytecode":[{"Const":{"bit_size":{"Integer":"U32"},"destination":{"Direct":21},"value":"0000000000000000000000000000000000000000000000000000000000000001"}},{"Const":{"bit_size":{"Integer":"U32"},"destination":{"Direct":20},"value":"0000000000000000000000000000000000000000000000000000000000000000"}},{"CalldataCopy":{"destination_address":{"Direct":0},"offset_address":{"Direct":20},"size_address":{"Direct":21}}},{"Const":{"bit_size":"Field","destination":{"Direct":2},"value":"0000000000000000000000000000000000000000000000000000000000000000"}},{"BinaryFieldOp":{"destination":{"Direct":3},"lhs":{"Direct":0},"op":"Equals","rhs":{"Direct":2}}},{"JumpIf":{"condition":{"Direct":3},"location":8}},{"Const":{"bit_size":"Field","destination":{"Direct":1},"value":"0000000000000000000000000000000000000000000000000000000000000001"}},{"BinaryFieldOp":{"destination":{"Direct":0},"lhs":{"Direct":1},"op":"Div","rhs":{"Direct":0}}},{"Stop":{"return_data":{"pointer":{"Direct":20},"size":{"Direct":21}}}}]}]}

@aakoshh
Copy link
Collaborator Author

aakoshh commented Mar 24, 2025

I ran into another one, and this time I'm out of ideas: AztecProtocol/aztec-packages#12841 (comment)

@aakoshh
Copy link
Collaborator Author

aakoshh commented Mar 25, 2025

I figured it out, it was a dangling pointer, fixed in AztecProtocol/aztec-packages@5810e3b

@aakoshh aakoshh closed this Mar 25, 2025
@aakoshh
Copy link
Collaborator Author

aakoshh commented Mar 26, 2025

Notably the dangling pointer pattern appears in multiple places in aztec-packages, e.g. here and even the docs.

@fcarreiro
Copy link

fcarreiro commented Mar 26, 2025

msgpack::unpack((const char*)encoded_data, encoded_data_size).get().convert(*value);

This one (in the docs) I think is fine. Let me explain why.

  1. msgpack::unpack((const char*)encoded_data, encoded_data_size) returns your handle, the one object that you DONT want to be deallocated.
  2. .get() returns a reference, which is the one that ends up being dangling if the handle gets deallocated
  3. .convert() uses that reference

however, because of the way this is written, the handle only gets deallocated AFTER this whole expression is used. So things are ok.

If instead you did

auto sth = msgpack::unpack((const char*)encoded_data, encoded_data_size).get();
auto x = sth.convert(*value);

then you are in trouble. Because the handle gets deallocated after the sth line, you cannot use the reference anymore.

PS: I have to check that things work like this, just my intuition says this is the explanation.

@aakoshh
Copy link
Collaborator Author

aakoshh commented Mar 26, 2025

Thanks, that sounds reasonable to me 👍 I'd expect the problem would have manifested itself already somehow if it wasn't right.

@fcarreiro
Copy link

https://www.programiz.com/online-compiler/493yYnMXkiR55

X constructed
getting i
X destroyed
7
now lets break it
X constructed
X destroyed
getting i
7

the last 7 may or may not work

aakoshh added a commit to AztecProtocol/aztec-packages that referenced this pull request Mar 27, 2025
Adds `msgpack` serialisation to the generated Acir and Witness C++ code.

I moved the alterations described in `dsl/README.md` into the code
generation itself, so no manual work is required. The PR is running
against a feature branch with the same name in Noir, here's the upstream
PR: noir-lang/noir#7716

With this PR is merged, `bb` should be able to handle `msgpack` or
`bincode`. Once that's released we can switch to using `msgpack` in Noir
in both native and wasm by merging
noir-lang/noir#7810. And then we can remove the
`msgpack` format detection and the fallback to `bincode`.

**TODO**:
- [x] Get it to compile 
- [x] Change `nargo` to allow compiling contracts with `msgpack` format:
added `NOIR_SERIALIZATION_FORMAT` env var
- [x] Add a first byte to the data to say which serialization format it
is. There is a chance that it would clash with existing bincode data
though, so a fallback would anyway be necessary. (Or we should ascertain
that bincode never starts with some specific bit sequence).
- [x] ~Change the `bb` code so it tries `bincode`, then falls back to
`msgpack` - this way the currently used format stays fast, but we can
feed it new data.~ _This looks problematic, as exceptions in the wasm
build is disabled in `arch.cmake` and `throw_or_abort` aborts in wasm.
Instead we run
[msgpack::parse](https://c.msgpack.org/cpp/namespacemsgpack.html#ad844d148ad1ff6c9193b02529fe32968)
first to check if the data looks like msgpack; if not, we use bincode._
- [x] Run integration tests with `msgpack` on both sides in
#13021
- [x] Ignore the Brillig opcodes in Barretenberg
- [x] Change the serialization of `WitnessStack` and `WitnessMap` to use
the env var, add fallbacks in `bb` for them
- [x] Revert the change to `noir-repo-ref` before merging


### Use of `MSGPACK_FIELDS`

The generated code is using `MSGPACK_FIELDS` for structs, to keep it
more terse.

At some point during debugging the memory issue below I changed it so
that I can have more direct control by generating code for individual
fields. That needed some helper functions which I looted from the
`msgpack-c` library and injected into the namespaces as a `Helpers`
struct. This approach might be useful if we wanted to have extra checks,
for example rejecting the data if there are extra fields, indicating a
type has been extended with things we don't recognise, or if we wanted
handle renamed fields. I left it out so there is less code to maintain,
but if we need it we can recover it from the [commit
history](noir-lang/noir@b0a612d).

### Compile `nargo` with the `msgpack` feature

```bash
echo af/msgpack-codegen > noir/noir-repo-ref
noir/bootstrap.sh
```

### Generate and compile C++ code

```bash
cd noir/noir-repo && NOIR_CODEGEN_OVERWRITE=1 cargo test -p acir cpp_codegen && cd -
cp noir/noir-repo/acvm-repo/acir/codegen/acir.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/acir.hpp
cp noir/noir-repo/acvm-repo/acir/codegen/witness.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp
cd barretenberg/cpp && ./format.sh changed && cd -

barretenberg/cpp/bootstrap.sh
```

### Test `nargo` with `bb`

One example of an integration test that uses `bb` and noir in the Noir
repo is
https://github.com/noir-lang/noir/actions/runs/13631231158/job/38099477964

We can call it like this:

```bash
cd noir/noir-repo && cargo install --path tooling/nargo_cli && cd -
./barretenberg/cpp/bootstrap.sh
export BACKEND=$(pwd)/barretenberg/cpp/build/bin/bb
export NOIR_SERIALIZATION_FORMAT=msgpack
noir/noir-repo/examples/prove_and_verify/test.sh
```

If it works, it should print this:
```console
% unset NOIR_SERIALIZATION_FORMAT                       
% noir/noir-repo/examples/prove_and_verify/test.sh      
[hello_world] Circuit witness successfully solved
[hello_world] Witness saved to /mnt/user-data/akosh/aztec-packages/noir/noir-repo/examples/prove_and_verify/target/witness.gz
Finalized circuit size: 18
Proof saved to "./proofs/proof"
Finalized circuit size: 18
VK saved to "./target/vk"
Proof verified successfully
```

Whereas if it doesn't:
```console
% export NOIR_SERIALIZATION_FORMAT=msgpack                                                                                                                                                            
% noir/noir-repo/examples/prove_and_verify/test.sh             
[hello_world] Circuit witness successfully solved
[hello_world] Witness saved to /mnt/user-data/akosh/aztec-packages/noir/noir-repo/examples/prove_and_verify/target/witness.gz
Length is too large
```

I attached the final artefacts to the PR so it's easier to test with
just `bb`.

[hello_world.json](https://github.com/user-attachments/files/19391072/hello_world.json)

[witness.gz](https://github.com/user-attachments/files/19391074/witness.gz)


### Further testing

With the `noir-repo-ref` pointing at the feature `af/msgpack-codegen`
feature branch, we can run all the contract compilations and tests with
`msgpack` as follows:

```shell
export NOIR_SERIALIZATION_FORMAT=msgpack       
./bootstrap.sh ci
```

This is tested in
#13021

### Peek into artefacts

We can inspect the file in JSON format using
[this](https://crates.io/crates/msgpack-cli) msgpack CLI tool.

```shell
jq -r '.bytecode' ./target/program.json | base64 --decode | gunzip | tail -c +2 | mpk --to-json | jq
```

Thanks Tom for the
[spell](AztecProtocol/msgpack-c#5 (comment))
🙏

### False bug

At some point I thought had to make some fixes in `msgpack-c` itself to
make this work: AztecProtocol/msgpack-c#5
A similar [blocking
bug](#12841 (comment))
was encountered when running the entire `ci` build with msgpack format.

It turned out it was a [dangling
pointer](msgpack/msgpack-c#695 (comment))
issue, fixed in
5810e3b

Much of the comments below are related to my struggles that came from
this mistake; you can ignore them.
AztecBot pushed a commit to AztecProtocol/barretenberg that referenced this pull request Mar 28, 2025
Adds `msgpack` serialisation to the generated Acir and Witness C++ code.

I moved the alterations described in `dsl/README.md` into the code
generation itself, so no manual work is required. The PR is running
against a feature branch with the same name in Noir, here's the upstream
PR: noir-lang/noir#7716

With this PR is merged, `bb` should be able to handle `msgpack` or
`bincode`. Once that's released we can switch to using `msgpack` in Noir
in both native and wasm by merging
noir-lang/noir#7810. And then we can remove the
`msgpack` format detection and the fallback to `bincode`.

**TODO**:
- [x] Get it to compile 
- [x] Change `nargo` to allow compiling contracts with `msgpack` format:
added `NOIR_SERIALIZATION_FORMAT` env var
- [x] Add a first byte to the data to say which serialization format it
is. There is a chance that it would clash with existing bincode data
though, so a fallback would anyway be necessary. (Or we should ascertain
that bincode never starts with some specific bit sequence).
- [x] ~Change the `bb` code so it tries `bincode`, then falls back to
`msgpack` - this way the currently used format stays fast, but we can
feed it new data.~ _This looks problematic, as exceptions in the wasm
build is disabled in `arch.cmake` and `throw_or_abort` aborts in wasm.
Instead we run
[msgpack::parse](https://c.msgpack.org/cpp/namespacemsgpack.html#ad844d148ad1ff6c9193b02529fe32968)
first to check if the data looks like msgpack; if not, we use bincode._
- [x] Run integration tests with `msgpack` on both sides in
AztecProtocol/aztec-packages#13021
- [x] Ignore the Brillig opcodes in Barretenberg
- [x] Change the serialization of `WitnessStack` and `WitnessMap` to use
the env var, add fallbacks in `bb` for them
- [x] Revert the change to `noir-repo-ref` before merging


### Use of `MSGPACK_FIELDS`

The generated code is using `MSGPACK_FIELDS` for structs, to keep it
more terse.

At some point during debugging the memory issue below I changed it so
that I can have more direct control by generating code for individual
fields. That needed some helper functions which I looted from the
`msgpack-c` library and injected into the namespaces as a `Helpers`
struct. This approach might be useful if we wanted to have extra checks,
for example rejecting the data if there are extra fields, indicating a
type has been extended with things we don't recognise, or if we wanted
handle renamed fields. I left it out so there is less code to maintain,
but if we need it we can recover it from the [commit
history](noir-lang/noir@b0a612d).

### Compile `nargo` with the `msgpack` feature

```bash
echo af/msgpack-codegen > noir/noir-repo-ref
noir/bootstrap.sh
```

### Generate and compile C++ code

```bash
cd noir/noir-repo && NOIR_CODEGEN_OVERWRITE=1 cargo test -p acir cpp_codegen && cd -
cp noir/noir-repo/acvm-repo/acir/codegen/acir.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/acir.hpp
cp noir/noir-repo/acvm-repo/acir/codegen/witness.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp
cd barretenberg/cpp && ./format.sh changed && cd -

barretenberg/cpp/bootstrap.sh
```

### Test `nargo` with `bb`

One example of an integration test that uses `bb` and noir in the Noir
repo is
https://github.com/noir-lang/noir/actions/runs/13631231158/job/38099477964

We can call it like this:

```bash
cd noir/noir-repo && cargo install --path tooling/nargo_cli && cd -
./barretenberg/cpp/bootstrap.sh
export BACKEND=$(pwd)/barretenberg/cpp/build/bin/bb
export NOIR_SERIALIZATION_FORMAT=msgpack
noir/noir-repo/examples/prove_and_verify/test.sh
```

If it works, it should print this:
```console
% unset NOIR_SERIALIZATION_FORMAT                       
% noir/noir-repo/examples/prove_and_verify/test.sh      
[hello_world] Circuit witness successfully solved
[hello_world] Witness saved to /mnt/user-data/akosh/aztec-packages/noir/noir-repo/examples/prove_and_verify/target/witness.gz
Finalized circuit size: 18
Proof saved to "./proofs/proof"
Finalized circuit size: 18
VK saved to "./target/vk"
Proof verified successfully
```

Whereas if it doesn't:
```console
% export NOIR_SERIALIZATION_FORMAT=msgpack                                                                                                                                                            
% noir/noir-repo/examples/prove_and_verify/test.sh             
[hello_world] Circuit witness successfully solved
[hello_world] Witness saved to /mnt/user-data/akosh/aztec-packages/noir/noir-repo/examples/prove_and_verify/target/witness.gz
Length is too large
```

I attached the final artefacts to the PR so it's easier to test with
just `bb`.

[hello_world.json](https://github.com/user-attachments/files/19391072/hello_world.json)

[witness.gz](https://github.com/user-attachments/files/19391074/witness.gz)


### Further testing

With the `noir-repo-ref` pointing at the feature `af/msgpack-codegen`
feature branch, we can run all the contract compilations and tests with
`msgpack` as follows:

```shell
export NOIR_SERIALIZATION_FORMAT=msgpack       
./bootstrap.sh ci
```

This is tested in
AztecProtocol/aztec-packages#13021

### Peek into artefacts

We can inspect the file in JSON format using
[this](https://crates.io/crates/msgpack-cli) msgpack CLI tool.

```shell
jq -r '.bytecode' ./target/program.json | base64 --decode | gunzip | tail -c +2 | mpk --to-json | jq
```

Thanks Tom for the
[spell](AztecProtocol/msgpack-c#5 (comment))
🙏

### False bug

At some point I thought had to make some fixes in `msgpack-c` itself to
make this work: AztecProtocol/msgpack-c#5
A similar [blocking
bug](AztecProtocol/aztec-packages#12841 (comment))
was encountered when running the entire `ci` build with msgpack format.

It turned out it was a [dangling
pointer](msgpack/msgpack-c#695 (comment))
issue, fixed in
AztecProtocol/aztec-packages@5810e3b

Much of the comments below are related to my struggles that came from
this mistake; you can ignore them.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants