-
Notifications
You must be signed in to change notification settings - Fork 579
feat: msgpack encoding for Program and WitnessStack
#12841
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3e6e25b to
aeefaf3
Compare
barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp
Show resolved
Hide resolved
|
For the record here's the msgpack bytecode without the format marker encoded as base64: It can be pasted into https://ref45638.github.io/msgpack-converter/ and parsed into correct looking JSON. |
|
Instead of trying to implement namespace msgpack {
namespace adaptor {
// For Opcode
template <> struct msgpack::adaptor::convert<Acir::Opcode> {
msgpack::object const& operator()(msgpack::object const& o, Acir::Opcode& v) const
{
if (o.type != msgpack::type::MAP || o.via.map.size != 1) {
throw_or_abort("expecteed signle element map for 'Opcode'");
}
auto& kv = o.via.map.ptr[0];
std::string key = kv.key.as<std::string>();
if (key == "BrilligCall") {
Acir::Opcode::BrilligCall bc = kv.val.as<Acir::Opcode::BrilligCall>();
v.value = bc;
} else if (key == "AssertZero") {
Acir::Opcode::AssertZero az = kv.val.as<Acir::Opcode::AssertZero>();
v.value = az;
} else {
throw_or_abort("unknown tag for 'Opcode': " + key);
}
return o;
}
};
} // namespace adaptor
} // namespace msgpack
This looks like an extension method that will make The problem is that |
|
Unfortunately all of the following variants result in with templating: taking an implicit object and casting: taking an object reference: 😞 |
|
My hypotheses is I modified that file at template <typename T, typename Alloc>
struct convert<std::vector<T, Alloc> > {
msgpack::object const& operator()(msgpack::object const& o, std::vector<T, Alloc>& v) const {
std::cerr << "reading array: " << o << std::endl;
if (o.type != msgpack::type::ARRAY) { THROW msgpack::type_error(); }
v.resize(o.via.array.size);
if (o.via.array.size > 0) {
msgpack::object* p = o.via.array.ptr;
msgpack::object* const pend = o.via.array.ptr + o.via.array.size;
typename std::vector<T, Alloc>::iterator it = v.begin();
do {
std::cerr << "item: " << *p << std::endl;
p->convert(*it);
++p;
++it;
} while(p < pend);
}
return o;
}
};It prints this during deserialisation: Not sure what's wrong with it. To be clear, the What I don't understand is how can it print the whole array, and then when looping through the items see them as NIL. I bit further investigation reveals that the type of the objects in the array changes between the printing and the processing: So the pointer address is the same 0x55a1aa246c00, but the |
|
Further weirdness: template <typename T, typename Alloc>
struct convert<std::vector<T, Alloc> > {
msgpack::object const& operator()(msgpack::object const& o, std::vector<T, Alloc>& v) const {
if (o.type != msgpack::type::ARRAY) { THROW msgpack::type_error(); }
std::cerr << "unpacking array 1: " << o << std::endl;
std::cerr << "unpacking array 2: " << o << std::endl;
v.resize(o.via.array.size);
std::cerr << "unpacking array 3: " << o << std::endl;
if (o.via.array.size > 0) {
std::cerr << "unpacking array 4: " << o << std::endl;
msgpack::object* p = o.via.array.ptr;
msgpack::object* const pend = o.via.array.ptr + o.via.array.size;
typename std::vector<T, Alloc>::iterator it = v.begin();
do {
p->convert(*it);
++p;
++it;
} while(p < pend);
}
return o;
}
};Prints: It's almost as if resizing |
|
This version works: template <typename T, typename Alloc>
struct convert<std::vector<T, Alloc> > {
msgpack::object const& operator()(msgpack::object const& o, std::vector<T, Alloc>& v) const {
if (o.type != msgpack::type::ARRAY) { THROW msgpack::type_error(); }
bool target_size = v.size();
if (target_size <= o.via.array.size) {
v.reserve(o.via.array.size);
} else {
v.resize(o.via.array.size);
}
if (o.via.array.size > 0) {
msgpack::object* p = o.via.array.ptr;
msgpack::object* const pend = o.via.array.ptr + o.via.array.size;
size_t i = 0;
do {
if (i >= target_size) {
T item;
p->convert(item);
v.push_back(item);
} else {
p->convert(v[i]);
}
++i;
++p;
} while(p < pend);
}
return o;
}
}; |
msgpack encoding for Program and WitnessStack
187e485 to
31f6ce5
Compare
31f6ce5 to
d39ebcf
Compare
|
@ludamad suggested that I should have used the Address Sanitizer when I hit weird behaviour. I changed the code back to cmake --preset asan && cmake --build --preset asanThis did not work, seemingly for the same reason as here: % cmake --preset asan && cmake --build --preset asan 8s ~/aztec-packages/barretenberg/cpp af/msgpack-codegen+ akosh-box
Preset CMake variables:
DISABLE_ASM="ON"
ENABLE_ASAN="ON"
TARGET_ARCH="skylake"
...
-- Build type: Debug
...
-- Build files have been written to: /mnt/user-data/akosh/aztec-packages/barretenberg/cpp/build-asan
[1/5] Linking CXX executable bin/ultra_vanilla_client_ivc_tests
FAILED: bin/ultra_vanilla_client_ivc_tests src/barretenberg/ultra_vanilla_client_ivc/ultra_vanilla_client_ivc_tests[1]_tests.cmake /mnt/user-data/akosh/aztec-packages/barretenberg/cpp/build-asan/src/barretenberg/ultra_vanilla_client_ivc/ultra_vanilla_client_ivc_tests[1]_tests.cmake
: && /usr/bin/clang++-16 -gdwarf-4 -g -gdwarf-4 -fsanitize=address -pthread src/barretenberg/ultra_vanilla_client_ivc/CMakeFiles/ultra_vanilla_client_ivc_test_objects.dir/ultra_vanilla_client_ivc.test.cpp.o -o bin/ultra_vanilla_client_ivc_tests lib/libultra_vanilla_client_ivc.a lib/libstdlib_honk_verifier.a lib/libgtest.a lib/libgtest_main.a lib/libgmock_main.a lib/libstdlib_transcript.a lib/libstdlib_poseidon2.a lib/libstdlib_circuit_builders.a lib/libtrace_to_polynomials.a lib/libstdlib_primitives.a lib/libplonk_honk_shared.a lib/libcircuit_checker.a lib/libultra_honk.a lib/libsumcheck.a lib/libstdlib_keccak.a lib/libstdlib_sha256.a lib/libstdlib_circuit_builders.a lib/libtrace_to_polynomials.a lib/libstdlib_primitives.a lib/libplonk_honk_shared.a lib/libcircuit_checker.a lib/libultra_honk.a lib/libsumcheck.a lib/libstdlib_keccak.a lib/libstdlib_sha256.a lib/librelations.a lib/libcrypto_pedersen_hash.a lib/libcrypto_pedersen_commitment.a lib/libcommitment_schemes.a lib/libsrs.a lib/libpolynomials.a lib/libtranscript.a lib/libcrypto_poseidon2.a lib/libecc.a lib/libnumeric.a lib/libcommon.a lib/libenv.a lib/libcrypto_sha256.a lib/libcrypto_keccak.a lib/libgmock.a lib/libgtest.a && cd /mnt/user-data/akosh/aztec-packages/barretenberg/cpp/build-asan/src/barretenberg/ultra_vanilla_client_ivc && /usr/bin/cmake -D TEST_TARGET=ultra_vanilla_client_ivc_tests -D TEST_EXECUTABLE=/mnt/user-data/akosh/aztec-packages/barretenberg/cpp/build-asan/bin/ultra_vanilla_client_ivc_tests -D TEST_EXECUTOR= -D TEST_WORKING_DIR=/mnt/user-data/akosh/aztec-packages/barretenberg/cpp/build-asan -D TEST_EXTRA_ARGS= -D TEST_PROPERTIES= -D TEST_PREFIX= -D TEST_SUFFIX= -D "TEST_FILTER=-*_SKIP_CI*" -D NO_PRETTY_TYPES=FALSE -D NO_PRETTY_VALUES=FALSE -D TEST_LIST=ultra_vanilla_client_ivc_tests_TESTS -D CTEST_FILE=/mnt/user-data/akosh/aztec-packages/barretenberg/cpp/build-asan/src/barretenberg/ultra_vanilla_client_ivc/ultra_vanilla_client_ivc_tests[1]_tests.cmake -D TEST_DISCOVERY_TIMEOUT=5 -D TEST_XML_OUTPUT_DIR= -P /usr/share/cmake-3.28/Modules/GoogleTestAddTests.cmake
/usr/bin/ld: src/barretenberg/ultra_vanilla_client_ivc/CMakeFiles/ultra_vanilla_client_ivc_test_objects.dir/ultra_vanilla_client_ivc.test.cpp.o:(.data.rel.ro._ZTVN2bb13CircuitSourceINS_11UltraFlavorENS_20UltraCircuitBuilder_INS_25UltraExecutionTraceBlocksEEENS1_15VerificationKeyEEE[_ZTVN2bb13CircuitSourceINS_11UltraFlavorENS_20UltraCircuitBuilder_INS_25UltraExecutionTraceBlocksEEENS1_15VerificationKeyEEE]+0x10): undefined reference to `bb::CircuitSource<bb::UltraFlavor, bb::UltraCircuitBuilder_<bb::UltraExecutionTraceBlocks>, bb::UltraFlavor::VerificationKey>::next()'
/usr/bin/ld: src/barretenberg/ultra_vanilla_client_ivc/CMakeFiles/ultra_vanilla_client_ivc_test_objects.dir/ultra_vanilla_client_ivc.test.cpp.o:(.data.rel.ro._ZTVN2bb13CircuitSourceINS_11UltraFlavorENS_20UltraCircuitBuilder_INS_25UltraExecutionTraceBlocksEEENS1_15VerificationKeyEEE[_ZTVN2bb13CircuitSourceINS_11UltraFlavorENS_20UltraCircuitBuilder_INS_25UltraExecutionTraceBlocksEEENS1_15VerificationKeyEEE]+0x18): undefined reference to `bb::CircuitSource<bb::UltraFlavor, bb::UltraCircuitBuilder_<bb::UltraExecutionTraceBlocks>, bb::UltraFlavor::VerificationKey>::num_circuits() const'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
[4/5] Linking CXX executable bin/dsl_tests
ninja: build stopped: subcommand failed.Thinking that the level of optimisation needs to be higher, I changed With this change the above error disappeared, but after a 45 minute build, there was nothing to point out the errors of my ways: % cmake --preset asan && cmake --build --preset asan
...
-- Configuring done (1.7s)
-- Generating done (0.3s)
-- Build files have been written to: /mnt/user-data/akosh/aztec-packages/barretenberg/cpp/build-asan
[846/859] Building CXX object src/barretenberg/vm2/CMakeFiles/relations_acc_bench_objects.dir/constraining/benchmark/relations_acc.bench.cpp.o
[859/859] Linking CXX executable bin/bb
% 43m 20s ~/aztec-packages/barretenberg/cpp af/msgpack-codegen+ akosh-box
% cmake --preset asan && cmake --build --preset asan ~/aztec-packages/barretenberg/cpp af/msgpack-codegen+ akosh-box
...
-- Build files have been written to: /mnt/user-data/akosh/aztec-packages/barretenberg/cpp/build-asan
ninja: no work to do.Not sure if I'm doing it right. |
TomAFrench
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've got some merge conflicts which have appeared. LGTM once addressed
fcarreiro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments.
It seems like you guys want to merge quickly so I won't block. If there's any problem down the line at least it seems quite localized.
barretenberg/cpp/src/barretenberg/dsl/acir_format/acir_to_constraint_buf.hpp
Outdated
Show resolved
Hide resolved
barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp
Outdated
Show resolved
Hide resolved
barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp
Outdated
Show resolved
Hide resolved
barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp
Outdated
Show resolved
Hide resolved
60a13e8 to
e3df1d1
Compare
🤖 I have created a new Aztec Packages release --- ## [0.82.3](v0.82.2...v0.82.3) (2025-03-27) ### Features * `msgpack` encoding for `Program` and `WitnessStack` ([#12841](#12841)) ([1e58eb1](1e58eb1)) * 64 bit log type id, 64 bit log metadata ([#12956](#12956)) ([20d734a](20d734a)) * AVM parsing tag validation ([#12936](#12936)) ([56b1f0d](56b1f0d)) * **avm:** add calldata & returndata to context ([#13008](#13008)) ([f03b2e5](f03b2e5)) * **avm:** merkle db hints (part 1) ([#12922](#12922)) ([34ec9e8](34ec9e8)) * **avm:** merkle hints (part 2) ([#13077](#13077)) ([fbbc6c7](fbbc6c7)) * **avm:** vm2 initial context ([#12972](#12972)) ([e2b1361](e2b1361)) * benchmark avm simulator ([#12985](#12985)) ([00fae1b](00fae1b)) * client flows benchmarks ([#13007](#13007)) ([9bf7568](9bf7568)) * gas benchmark for "normal usage" ([#13073](#13073)) ([4eb1156](4eb1156)) * Implement merkle writes in the merkle check gadget ([#13050](#13050)) ([c94fe50](c94fe50)) * LogEncryption trait ([#12942](#12942)) ([0b7e564](0b7e564)) * Node snapshot sync ([#12927](#12927)) ([afde851](afde851)), closes [#12926](#12926) * **p2p:** capture all gossipsub metrics ([#12930](#12930)) ([cc940cb](cc940cb)) * Prover node snapshot sync ([#13097](#13097)) ([1e77efb](1e77efb)) * staking asset handler ([#12968](#12968)) ([af48184](af48184)), closes [#12932](#12932) * stream crs data to disk ([#12996](#12996)) ([d016e4d](d016e4d)), closes [#12948](#12948) * track failed tests. add flake. ([f4936d7](f4936d7)) * Track test history. ([#13037](#13037)) ([036bb32](036bb32)) * track total tx fee ([#12601](#12601)) ([9612a4e](9612a4e)) * Validators sentinel ([#12818](#12818)) ([770695c](770695c)) ### Bug Fixes * added #[derive(Eq)] to EcdsaPublicKeyNote ([#12966](#12966)) ([0c21c74](0c21c74)) * Allow use of local blob sink client ([#13025](#13025)) ([ba8d654](ba8d654)) * **avm:** semicolons are hard ([#12999](#12999)) ([8871c83](8871c83)) * bootstrap network and sponsored fpc devnet ([#13044](#13044)) ([8a47d8b](8a47d8b)) * Bump tsc target ([#13052](#13052)) ([985e83b](985e83b)) * cycle_group fuzzer ([#12921](#12921)) ([69f426e](69f426e)) * **docs:** Fix import errors in aztec.js tutorial ([#12969](#12969)) ([856208a](856208a)) * **docs:** Load token artifact from the compiled source in the sample dapp tutorial ([#12802](#12802)) ([0838084](0838084)), closes [#12810](#12810) * **docs:** Update sponsored fpc docs to use 82.2 syntax ([#13054](#13054)) ([e5d425b](e5d425b)) * **e2e:** p2p ([#13002](#13002)) ([1ece539](1ece539)) * extend e2e 2 pxes timeout. strip color codes for error_regex. ([73820e4](73820e4)) * flake ([6cc9e81](6cc9e81)) * fuzzer on staking asset handler constructor test ([#13101](#13101)) ([d936285](d936285)) * invalid getCommittee function ([#13072](#13072)) ([327341f](327341f)) * mac publish should use clang 18 like x-compiler, and use it ([#12983](#12983)) ([7b83c45](7b83c45)) * make circuit parsing deterministic ([#11772](#11772)) ([76ef873](76ef873)) * parse away trailing slash from consensus host ([#12577](#12577)) ([6701806](6701806)) * prerelease versions should be pushed to install.aztec.network ([#13086](#13086)) ([c4e6039](c4e6039)) * smoke ([#13060](#13060)) ([7756b15](7756b15)) * some flake additions ([58638f1](58638f1)) * sponsored fpc arg parsed correctly ([#12976](#12976)) ([#12977](#12977)) ([a85f530](a85f530)) * starting the sandbox with no pxe should still deploy initial test accounts ([#13047](#13047)) ([d92d895](d92d895)) * Syntax error when running tests via jest after tsc build ([#13051](#13051)) ([f972db9](f972db9)) * Use the correct image in aztec start ([#13058](#13058)) ([06285cd](06285cd)) * yolo fix ([91e2f4b](91e2f4b)) * yolo fix nightly ([b3b3259](b3b3259)) * yolo fix obvious thing to track fails. ([2fee630](2fee630)) * yolo flakes ([e3b030a](e3b030a)) * yolo set -x ([bfd3205](bfd3205)) * yolo we suspect the halt is making tests fail that would have passed ([04e3fa2](04e3fa2)) ### Miscellaneous * `getIndexedTaggingSecretAsSender` oracle cleanup ([#13015](#13015)) ([8e71e55](8e71e55)) * Add a script to generate cpp files for AVM2 ([#13091](#13091)) ([7bb43a9](7bb43a9)) * add default native proving for cli wallet ([#12855](#12855)) ([c0f773c](c0f773c)) * add default native proving for cli wallet retry ([#13028](#13028)) ([b2f4785](b2f4785)) * Alpha testnet into master ([#13033](#13033)) ([d98fdbd](d98fdbd)) * AVM TS - move tag validation outside of instruction constructors ([#13038](#13038)) ([45548ab](45548ab)), closes [#12934](#12934) * **avm:** final codegen nuking ([#13089](#13089)) ([9c82f3f](9c82f3f)) * **avm:** remove codegen (all but flavor) ([#13079](#13079)) ([e1f2bdd](e1f2bdd)) * **bb:** minor acir buf C++ improvements ([#13042](#13042)) ([1ebd044](1ebd044)) * boxes dep cleanup ([#12979](#12979)) ([6540b7c](6540b7c)) * **ci:** less catch all e2e_p2p flakes ([#12737](#12737)) ([2134634](2134634)) * comprehensive cleanup of translator flavor and use inheritance properly in flavors ([#13041](#13041)) ([dc5f78f](dc5f78f)) * compress storage footprint ([#12871](#12871)) ([58c110f](58c110f)) * display warning when installing bb versions < 0.82.0 ([#13027](#13027)) ([7247fe7](7247fe7)) * **docs:** Update docs on fees and various other updates ([#12929](#12929)) ([1dec907](1dec907)) * dump dmesg/net/cpu/mem usage at end of ci run ([#12967](#12967)) ([8877792](8877792)) * fix governance util issue ([#13043](#13043)) ([d768d26](d768d26)) * redundant if in affine from projective constructor ([#13045](#13045)) ([3a7ba2d](3a7ba2d)) * remove addition of dummy ops in mock circuit producer ([#13003](#13003)) ([a64d1dc](a64d1dc)) * remove dummy ops in decider pk ([#13049](#13049)) ([da6d021](da6d021)) * replace relative paths to noir-protocol-circuits ([e1b88f6](e1b88f6)) * replace relative paths to noir-protocol-circuits ([849b4b0](849b4b0)) * replace relative paths to noir-protocol-circuits ([18a02d6](18a02d6)) * Revert "chore: add default native proving for cli wallet ([#12855](#12855))" ([#13013](#13013)) ([98e2576](98e2576)) * Speed up and deflake sentinel test ([#13078](#13078)) ([27f1eca](27f1eca)) * **testnet:** making consensus host mandatory input ([#12716](#12716)) ([d47c74a](d47c74a)) * towards no more mock op_queues ([#12984](#12984)) ([fefffa7](fefffa7)) * update bb version for noir 1.0.0-beta.0+ ([#13026](#13026)) ([dd68074](dd68074)) * update CODEOWNERS to reflect new sync method ([#12998](#12998)) ([a3d1915](a3d1915)) ### Documentation * Add fees to cli reference ([#12884](#12884)) ([4a0fd58](4a0fd58)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Adds `msgpack` serialisation to the generated Acir and Witness C++ code. I moved the alterations described in `dsl/README.md` into the code generation itself, so no manual work is required. The PR is running against a feature branch with the same name in Noir, here's the upstream PR: noir-lang/noir#7716 With this PR is merged, `bb` should be able to handle `msgpack` or `bincode`. Once that's released we can switch to using `msgpack` in Noir in both native and wasm by merging noir-lang/noir#7810. And then we can remove the `msgpack` format detection and the fallback to `bincode`. **TODO**: - [x] Get it to compile - [x] Change `nargo` to allow compiling contracts with `msgpack` format: added `NOIR_SERIALIZATION_FORMAT` env var - [x] Add a first byte to the data to say which serialization format it is. There is a chance that it would clash with existing bincode data though, so a fallback would anyway be necessary. (Or we should ascertain that bincode never starts with some specific bit sequence). - [x] ~Change the `bb` code so it tries `bincode`, then falls back to `msgpack` - this way the currently used format stays fast, but we can feed it new data.~ _This looks problematic, as exceptions in the wasm build is disabled in `arch.cmake` and `throw_or_abort` aborts in wasm. Instead we run [msgpack::parse](https://c.msgpack.org/cpp/namespacemsgpack.html#ad844d148ad1ff6c9193b02529fe32968) first to check if the data looks like msgpack; if not, we use bincode._ - [x] Run integration tests with `msgpack` on both sides in AztecProtocol/aztec-packages#13021 - [x] Ignore the Brillig opcodes in Barretenberg - [x] Change the serialization of `WitnessStack` and `WitnessMap` to use the env var, add fallbacks in `bb` for them - [x] Revert the change to `noir-repo-ref` before merging ### Use of `MSGPACK_FIELDS` The generated code is using `MSGPACK_FIELDS` for structs, to keep it more terse. At some point during debugging the memory issue below I changed it so that I can have more direct control by generating code for individual fields. That needed some helper functions which I looted from the `msgpack-c` library and injected into the namespaces as a `Helpers` struct. This approach might be useful if we wanted to have extra checks, for example rejecting the data if there are extra fields, indicating a type has been extended with things we don't recognise, or if we wanted handle renamed fields. I left it out so there is less code to maintain, but if we need it we can recover it from the [commit history](noir-lang/noir@b0a612d). ### Compile `nargo` with the `msgpack` feature ```bash echo af/msgpack-codegen > noir/noir-repo-ref noir/bootstrap.sh ``` ### Generate and compile C++ code ```bash cd noir/noir-repo && NOIR_CODEGEN_OVERWRITE=1 cargo test -p acir cpp_codegen && cd - cp noir/noir-repo/acvm-repo/acir/codegen/acir.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/acir.hpp cp noir/noir-repo/acvm-repo/acir/codegen/witness.cpp barretenberg/cpp/src/barretenberg/dsl/acir_format/serde/witness_stack.hpp cd barretenberg/cpp && ./format.sh changed && cd - barretenberg/cpp/bootstrap.sh ``` ### Test `nargo` with `bb` One example of an integration test that uses `bb` and noir in the Noir repo is https://github.com/noir-lang/noir/actions/runs/13631231158/job/38099477964 We can call it like this: ```bash cd noir/noir-repo && cargo install --path tooling/nargo_cli && cd - ./barretenberg/cpp/bootstrap.sh export BACKEND=$(pwd)/barretenberg/cpp/build/bin/bb export NOIR_SERIALIZATION_FORMAT=msgpack noir/noir-repo/examples/prove_and_verify/test.sh ``` If it works, it should print this: ```console % unset NOIR_SERIALIZATION_FORMAT % noir/noir-repo/examples/prove_and_verify/test.sh [hello_world] Circuit witness successfully solved [hello_world] Witness saved to /mnt/user-data/akosh/aztec-packages/noir/noir-repo/examples/prove_and_verify/target/witness.gz Finalized circuit size: 18 Proof saved to "./proofs/proof" Finalized circuit size: 18 VK saved to "./target/vk" Proof verified successfully ``` Whereas if it doesn't: ```console % export NOIR_SERIALIZATION_FORMAT=msgpack % noir/noir-repo/examples/prove_and_verify/test.sh [hello_world] Circuit witness successfully solved [hello_world] Witness saved to /mnt/user-data/akosh/aztec-packages/noir/noir-repo/examples/prove_and_verify/target/witness.gz Length is too large ``` I attached the final artefacts to the PR so it's easier to test with just `bb`. [hello_world.json](https://github.com/user-attachments/files/19391072/hello_world.json) [witness.gz](https://github.com/user-attachments/files/19391074/witness.gz) ### Further testing With the `noir-repo-ref` pointing at the feature `af/msgpack-codegen` feature branch, we can run all the contract compilations and tests with `msgpack` as follows: ```shell export NOIR_SERIALIZATION_FORMAT=msgpack ./bootstrap.sh ci ``` This is tested in AztecProtocol/aztec-packages#13021 ### Peek into artefacts We can inspect the file in JSON format using [this](https://crates.io/crates/msgpack-cli) msgpack CLI tool. ```shell jq -r '.bytecode' ./target/program.json | base64 --decode | gunzip | tail -c +2 | mpk --to-json | jq ``` Thanks Tom for the [spell](AztecProtocol/msgpack-c#5 (comment)) 🙏 ### False bug At some point I thought had to make some fixes in `msgpack-c` itself to make this work: AztecProtocol/msgpack-c#5 A similar [blocking bug](AztecProtocol/aztec-packages#12841 (comment)) was encountered when running the entire `ci` build with msgpack format. It turned out it was a [dangling pointer](msgpack/msgpack-c#695 (comment)) issue, fixed in AztecProtocol/aztec-packages@5810e3b Much of the comments below are related to my struggles that came from this mistake; you can ignore them.
Followup for #12841 Changes code generation for msgpack so that it doesn't throw an error if an optional field of a `struct` is not present in the data. This is to allow @TomAFrench and @asterite to delete `Opcode::MemoryOp::predicate` which is an optional field that we no longer use. Removing such a field should be a non-breaking change, as the field was optional to begin with, so while even if it's present in C++ it should already handle it being empty. Unfortunately the `msgpack-c` library as far as I can see [would throw an error](https://github.com/AztecProtocol/msgpack-c/blob/54e9865b84bbdc73cfbf8d1d437dbf769b64e386/include/msgpack/v1/adaptor/detail/cpp11_define_map.hpp#L33-L45) if the optional field would not be present in the data as NIL. For this to work the PR re-introduces the `Helpers` and enumerates fields explicitly, instead of using `MSGPACK_FIELDS`. I changed the unmerged noir-lang/noir#7716 to do codegen with this change. I rebased #13021 on top of this PR to see if it works when msgpack is actually in use. ### Note for future migration path @ludamad reached out that while the bytecode size increase shown in noir-lang/noir#7690 doesn't seem too bad, and compression compensates for the inclusion of string field names, it's still wasteful to have to parse them, and it would be better to use arrays. I established in [tests](https://github.com/noir-lang/noir/pull/7690/files#diff-2d66028e5a8966511a76d1740d752be294c0b6a46e0a567bc2959f91d9ce224bR169-R176) that we what we call `msgpack-compact` format uses arrays for structs, but still tags enums with types. We use a lot of enums, so there is still quite a few strings. Ostensibly we could use [serde attributes](https://serde.rs/container-attrs.html) to shorten names, but it would probably be a nightmare and ugly. Nevertheless if we generated C++ code to deal with arrays, we could save some space. And if we want to stick to `bincode`, we can use `NOIR_SERIALIZATION_FORMAT=bincode`, which I back ported to the Noir codegen PR, to generate `bincode` with a format byte marker. There is also `bincode-legacy`, but unfortunately we only have one shot at deserialising bincode in C++: if it fails, we can't catch the exception. Therefore the path to be able to use the bincode format marker is: 1. Release `bb` which can handle the `msgpack` format (which has a probe, doesn't have to throw) 2. Start producing msgpack data from `nargo` 3. Stop accepting unmarked bincode in `bb` and look for format byte == 1 to show that bincode is in use 4. Tell `nargo` which format to use EDIT: Unfortunately if we use `binpack` with today's data types it forces us to parse the Brillig part, as established by #13143 which would mean the Noir team can't make any changes to Brillig opcodes without breaking `bb`. We would need to change the format again to use two tier encoding, or use msgpack arrays.
Followup for AztecProtocol/aztec-packages#12841 Changes code generation for msgpack so that it doesn't throw an error if an optional field of a `struct` is not present in the data. This is to allow @TomAFrench and @asterite to delete `Opcode::MemoryOp::predicate` which is an optional field that we no longer use. Removing such a field should be a non-breaking change, as the field was optional to begin with, so while even if it's present in C++ it should already handle it being empty. Unfortunately the `msgpack-c` library as far as I can see [would throw an error](https://github.com/AztecProtocol/msgpack-c/blob/54e9865b84bbdc73cfbf8d1d437dbf769b64e386/include/msgpack/v1/adaptor/detail/cpp11_define_map.hpp#L33-L45) if the optional field would not be present in the data as NIL. For this to work the PR re-introduces the `Helpers` and enumerates fields explicitly, instead of using `MSGPACK_FIELDS`. I changed the unmerged noir-lang/noir#7716 to do codegen with this change. I rebased AztecProtocol/aztec-packages#13021 on top of this PR to see if it works when msgpack is actually in use. ### Note for future migration path @ludamad reached out that while the bytecode size increase shown in noir-lang/noir#7690 doesn't seem too bad, and compression compensates for the inclusion of string field names, it's still wasteful to have to parse them, and it would be better to use arrays. I established in [tests](https://github.com/noir-lang/noir/pull/7690/files#diff-2d66028e5a8966511a76d1740d752be294c0b6a46e0a567bc2959f91d9ce224bR169-R176) that we what we call `msgpack-compact` format uses arrays for structs, but still tags enums with types. We use a lot of enums, so there is still quite a few strings. Ostensibly we could use [serde attributes](https://serde.rs/container-attrs.html) to shorten names, but it would probably be a nightmare and ugly. Nevertheless if we generated C++ code to deal with arrays, we could save some space. And if we want to stick to `bincode`, we can use `NOIR_SERIALIZATION_FORMAT=bincode`, which I back ported to the Noir codegen PR, to generate `bincode` with a format byte marker. There is also `bincode-legacy`, but unfortunately we only have one shot at deserialising bincode in C++: if it fails, we can't catch the exception. Therefore the path to be able to use the bincode format marker is: 1. Release `bb` which can handle the `msgpack` format (which has a probe, doesn't have to throw) 2. Start producing msgpack data from `nargo` 3. Stop accepting unmarked bincode in `bb` and look for format byte == 1 to show that bincode is in use 4. Tell `nargo` which format to use EDIT: Unfortunately if we use `binpack` with today's data types it forces us to parse the Brillig part, as established by AztecProtocol/aztec-packages#13143 which would mean the Noir team can't make any changes to Brillig opcodes without breaking `bb`. We would need to change the format again to use two tier encoding, or use msgpack arrays.
#18937) Brings the changes in the generated C++ for deserialising programs from noir-lang/noir#10878 adding support for the `msgpack-compact` format. Removes the exception if `msgpack` or `msgpack-compact` format is encountered in the format marker. Tested the same way as #12841 but with `export NOIR_SERIALIZATION_FORMAT=msgpack-compact`.
Adds
msgpackserialisation to the generated Acir and Witness C++ code.I moved the alterations described in
dsl/README.mdinto the code generation itself, so no manual work is required. The PR is running against a feature branch with the same name in Noir, here's the upstream PR: noir-lang/noir#7716With this PR is merged,
bbshould be able to handlemsgpackorbincode. Once that's released we can switch to usingmsgpackin Noir in both native and wasm by merging noir-lang/noir#7810. And then we can remove themsgpackformat detection and the fallback tobincode.TODO:
nargoto allow compiling contracts withmsgpackformat: addedNOIR_SERIALIZATION_FORMATenv varChange theThis looks problematic, as exceptions in the wasm build is disabled inbbcode so it triesbincode, then falls back tomsgpack- this way the currently used format stays fast, but we can feed it new data.arch.cmakeandthrow_or_abortaborts in wasm. Instead we run msgpack::parse first to check if the data looks like msgpack; if not, we use bincode.msgpackon both sides in test(DO NOT MERGE): Try running integration tests withmsgpack#13021WitnessStackandWitnessMapto use the env var, add fallbacks inbbfor themnoir-repo-refbefore mergingUse of
MSGPACK_FIELDSThe generated code is using
MSGPACK_FIELDSfor structs, to keep it more terse.At some point during debugging the memory issue below I changed it so that I can have more direct control by generating code for individual fields. That needed some helper functions which I looted from the
msgpack-clibrary and injected into the namespaces as aHelpersstruct. This approach might be useful if we wanted to have extra checks, for example rejecting the data if there are extra fields, indicating a type has been extended with things we don't recognise, or if we wanted handle renamed fields. I left it out so there is less code to maintain, but if we need it we can recover it from the commit history.I'm not actually sure if the code that handles structs can deal with missing optional fields, which could be another reason to bypass the macro.
Compile
nargowith themsgpackfeatureGenerate and compile C++ code
Try the following for a faster
bbbuild:Test
nargowithbbOne example of an integration test that uses
bband noir in the Noir repo is https://github.com/noir-lang/noir/actions/runs/13631231158/job/38099477964We can call it like this:
If it works, it should print this:
Whereas if it doesn't:
I attached the final artefacts to the PR so it's easier to test with just
bb.hello_world.json
witness.gz
Further testing
With the
noir-repo-refpointing at the featureaf/msgpack-codegenfeature branch, we can run all the contract compilations and tests withmsgpackas follows:export NOIR_SERIALIZATION_FORMAT=msgpack ./bootstrap.sh ciThis is tested in #13021
Peek into artefacts
We can inspect the file in JSON format using this msgpack CLI tool.
Thanks Tom for the spell 🙏
False bug
At some point I thought had to make some fixes in
msgpack-citself to make this work: AztecProtocol/msgpack-c#5A similar blocking bug was encountered when running the entire
cibuild with msgpack format.It turned out it was a dangling pointer issue, fixed in 5810e3b
Much of the comments below are related to my struggles that came from this mistake; you can ignore them.