Skip to content

Add support for importing asm-json. #11807

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 19 commits into from
Closed

Add support for importing asm-json. #11807

wants to merge 19 commits into from

Conversation

aarlt
Copy link
Member

@aarlt aarlt commented Aug 16, 2021

Closes #11787.

  • the import can be done with --import-asm-json.
  • support for --opcodes, --asm and --bin and --bin-runtime, if --import-asm-json is used.
  • add additional tests that reimport the following assembly items:
    • PUSH tag
    • PUSH [ErrorTag]
    • PUSHLIB
    • PUSHDEPLOYADDRESS
    • PUSHIMMUTABLE
    • VERBATIM
  • support combinedJson for --import-asm-json
    • srcmap, srcmap-runtime
      • add import tests
    • check for invalid combinedJson parameters
  • support optimisation
  • AsmJsonImportTest.sh: also compare optimiser output

Note:

@aarlt aarlt marked this pull request as draft August 16, 2021 23:16
@aarlt

This comment has been minimized.

Comment on lines 584 to 587
("Import assembly to be used, assumes input holds assembly in JSON format. "
"Supported Input is the output of the --" + g_strAsmJson + ".").c_str()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be clarified a bit more, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, what should be clarified a bit more?

@aarlt
Copy link
Member Author

aarlt commented Aug 20, 2021

Ok, finally I found out what created the mismatching bytecode. I added the auxiliary data to the wrong assembly object.

@ekpyron
Copy link
Member

ekpyron commented Aug 20, 2021

If exporting and importing anything really works and produces identical bytecode everywhere, it would probably be nice to add a test run similar to the AST import tests, that just compiles all sources in the repo once directly to bytecode and once via exporting assembly and then importing it back first, comparing the resulting bytecode :-).
If we do that, then one more dedicated test with an artificial assembly JSON containing one of each possible assembly items would probably be enough.

@aarlt aarlt force-pushed the asm-json-reimport branch 3 times, most recently from 65c86cc to d120ae4 Compare August 26, 2021 15:44
@ethereum ethereum deleted a comment from stackenbotten Aug 26, 2021
@ethereum ethereum deleted a comment from stackenbotten Aug 26, 2021
@ethereum ethereum deleted a comment from stackenbotten Aug 26, 2021
@ethereum ethereum deleted a comment from stackenbotten Aug 26, 2021
@ethereum ethereum deleted a comment from stackenbotten Aug 26, 2021
@aarlt
Copy link
Member Author

aarlt commented Aug 26, 2021

After checking what assembly items where exactly imported by AsmJsonImportTest.sh, it turned out that the following assembly items never get imported due to missing tests.

  • PUSH tag
  • PUSH [ErrorTag]
  • PUSHLIB
  • PUSHDEPLOYADDRESS
  • PUSHIMMUTABLE
  • VERBATIM

TODO: Add specific tests for these assembly items.

@aarlt aarlt force-pushed the asm-json-reimport branch 2 times, most recently from 413ff83 to f89d6b8 Compare August 26, 2021 23:33
@aarlt
Copy link
Member Author

aarlt commented Aug 29, 2021

Added json re-import - and byte code equivalence test to test/libevmasm/Assembler.cpp.
Still missing assembly items:

  • PUSH tag
  • PUSH [ErrorTag]
  • VERBATIM

@aarlt aarlt force-pushed the asm-json-reimport branch from f89d6b8 to 7238056 Compare August 29, 2021 23:53

auto updateLibraries = [&](string const& value) -> h256 {
h256 hash(util::keccak256(value));
_assembly.m_libraries[hash] = value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to check that it's a valid address (length, chars, maybe checksum?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not using any addresses here. value is just a name of a library (I will change the parameter name). The hash is just used as a placeholder, where linking will replace the placeholder with the actual library address.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess linking will do some checks on the supplied addresses, but I haven't checked this yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, these are the placeholders.

Still, the placeholders need to have a specific length, start and end with __, etc. Linking is done in LinkerObject.cpp and it does not seem to have any validations (or even assertions) against that. AFAIK all the validation is in the CLI and StandardCompiler.

BTW, some other parts of this JSON structure might need some validation too.

@@ -227,6 +229,221 @@ string Assembly::toStringInHex(u256 _value)
return hexStr.str();
}

AssemblyItem Assembly::assemblyItemFromJSON(Json::Value const& _json, Assembly& _assembly)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I thought this just returns a new item but then I noticed that the function actually modifies _assembly. I think the name should reflect that.

Actually, maybe a better design would be to have two functions - one that just gives you a list of items (and does not modify the assembly) and another one that takes the list and updates the assembly based on it? If you had to explicitly run the update function, it would be clearer that an update is taking place.

root[".auxdata"] = toHex(m_auxiliaryData);

return root;
}

void Assembly::forEachJSONAssemblyItem(Json::Value const& _code, Assembly& _assembly)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the change I suggested above this function would get a list of assembly items. You would not have to call assemblyItemFromJSON() twice for each item (which is especially bad given that assemblyItemFromJSON has side-effects - it modifies the assembly).

!m_compiler->compilationSuccessful() &&
m_options.output.stopAfter == CompilerStack::State::CompilationSuccessful
)
if (m_assembly)
Copy link
Member

@cameel cameel Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can m_assembly and m_compiler both be present at the same time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, only m_assembly or m_compiler is possible at the same time. Change this to else if.

@@ -1187,7 +1195,7 @@ General Information)").c_str(),
if (m_options.input.mode == InputMode::Compiler)
m_options.input.errorRecovery = (m_args.count(g_strErrorRecovery) > 0);

solAssert(m_options.input.mode == InputMode::Compiler || m_options.input.mode == InputMode::CompilerWithASTImport, "");
solAssert(m_options.input.mode == InputMode::AssemblyJson || m_options.input.mode == InputMode::Compiler || m_options.input.mode == InputMode::CompilerWithASTImport, "");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this new input mode support all of the arguments that the compiler does? If not, I think it should make the function return at a much earlier point.

Also, which other command-line flags are usable in this mode? There's an issue underway to report errors for flags invalid in other modes (#11629) and I think we should start rejecting invalid flags in this new mode from the very beginning.

Each input mode has a test in test/solc/CommandLineParser.cpp that checks all the options valid in that mode. This new mode should have such a test too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a very simple check in solc/CommandLineParser.cpp. Basically with the new import mode only --asm, --bin and --opcodes are right now supported.

@@ -49,6 +49,7 @@ enum class InputMode
StandardJson,
Linker,
Assembler,
AssemblyJson,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assembly is a bit ambiguous. On the CLI it usually refers to Yul. Maybe we should call it EVMAssemblyJson instead? Or rename Assembler mode to Yul?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about ImportAssemblyJson?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I see is just that it might not be clear that Assembly refers to EVM assembly rather than Yul here. On the other hand the class is named Assembly, so maybe it's not such a big problem...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just changed that now to ImportEVMAssemblyJson.

@aarlt aarlt force-pushed the asm-json-reimport branch 2 times, most recently from 9a65e7e to c30f027 Compare September 6, 2021 04:17
@aarlt aarlt force-pushed the asm-json-reimport branch 2 times, most recently from acdbc32 to 2e4ddd9 Compare September 6, 2021 16:06
@aarlt aarlt force-pushed the asm-json-reimport branch from 3c3eb0e to 5c3a618 Compare November 12, 2021 15:35
@aarlt aarlt force-pushed the asm-json-reimport branch from 73a7a6e to 9721c82 Compare November 16, 2021 02:49
@ethereum ethereum deleted a comment from stackenbotten Nov 16, 2021
@ethereum ethereum deleted a comment from stackenbotten Nov 16, 2021
@ethereum ethereum deleted a comment from stackenbotten Nov 16, 2021
@ethereum ethereum deleted a comment from stackenbotten Nov 16, 2021
@ethereum ethereum deleted a comment from stackenbotten Nov 16, 2021
@ethereum ethereum deleted a comment from stackenbotten Nov 16, 2021
@ethereum ethereum deleted a comment from stackenbotten Nov 16, 2021
@ethereum ethereum deleted a comment from stackenbotten Nov 16, 2021
@aarlt
Copy link
Member Author

aarlt commented Feb 28, 2022

Replaced by #12704.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compiler mode to reimport assembly json.
6 participants