Add support to import evm assembly json (updated). #13673

aarlt · 2022-11-02T05:10:00Z

~~Target branch is import-export-script-refactoring (see #13576).~~

Depends on ~~#13576~~ (merged), ~~#13577~~ (merged), ~~#13578~~ (merged)and ~~#13579~~ (merged)

Closes #11787. Replaces #12834.

TODO

Rename scripts/ASTImportTest.sh to scripts/ImportExportTest.sh (will be done as last step before merging)
Update import/export test to cover the cases described here: Add support to import evm assembly json (updated). #13673 (comment)
Run EVM Assembly and AST import/export equivalence tests in parallel
Consider refactoring Assembly.cpp to account for this comment: Add support to import evm assembly json (updated). #13673 (comment) (Make a separated PR)
Create a separated PR adding support to --asm-json output in the assembler mode (based on this comment: Add support to import evm assembly json (updated). #13673 (comment))
Refactor CommandLineInterface::handleCombinedJSON() to account for: Add support to import evm assembly json (updated). #13673 (comment)

aarlt · 2022-11-02T06:16:35Z

Execution of the EVM Assembly JSON import/export tests need ~20 minutes.

nikola-matic · 2022-11-02T14:29:50Z

You got tons of std:: qualifier usages in cpp files here by the way.

libevmasm/Assembly.cpp

nikola-matic · 2022-11-02T14:42:52Z

Also, changelog entry since this introduces a new input mode right?

nikola-matic · 2022-11-03T13:18:32Z

This PR will replace #12834.

PR contains cherry-picked commits from #13576 - once these PRs are merged, it's commits will be removed from this PR.

Depends on #13576, ~~#13577~~, ~~#13578~~ and ~~#13579~~

Closes #11787.

So I see that #13576 is a dependency, but it looks like you've deleted (or renamed) ASTImportTest.sh; is it still a dependency, and how do you plan to resolve #13576 so it's aligned with this PR?

nikola-matic · 2022-11-03T13:19:12Z

Also, if you rebase against develop, you can finally get them sweet sweet green builds.

aarlt · 2022-11-03T15:59:44Z

This PR will replace #12834.
PR contains cherry-picked commits from #13576 - once these PRs are merged, it's commits will be removed from this PR.
Depends on #13576, ~~#13577~~, ~~#13578~~ and ~~#13579~~
Closes #11787.

So I see that #13576 is a dependency, but it looks like you've deleted (or renamed) ASTImportTest.sh; is it still a dependency, and how do you plan to resolve #13576 so it's aligned with this PR?

Yeah, initially we had import/export tests only for the AST (ASTImportTest.sh). But with this PR we will have an additional kind of import/export tests for the EVM Assembly. So now the script will support AST & EVM Assembly import/export tests, that's why I just renamed it to ImportExportTest.sh.

In this PR the renaming is done in a single commit. But we could also rename the script later. It is then probably easier to read the PR.

libevmasm/Assembly.cpp

libsolidity/interface/CompilerStack.cpp

nikola-matic · 2022-11-03T18:37:27Z

libevmasm/Assembly.cpp

+		{
+			shared_ptr<Assembly> subassembly(Assembly::loadFromAssemblyJSON(code, sourceList, /* isCreation = */ false));
+			assertThrow(subassembly, AssemblyException, "");
+			result->m_subs.emplace_back(make_shared<Assembly>(*subassembly));


Could m_subs be a vector of unique_ptrs? If it can, it should, since from what I can tell, we're only iterating over its contents later on, so no need for this to be a sharted_ptr.

Maybe that's possible to change that, but it would be an unrelated change. But if it makes sense, we could create another PR changing this.

cameel · 2023-10-20T18:39:29Z

libevmasm/EVMAssemblyStack.cpp

-	solRequire(jsonParseStrict(_source, assemblyJson), AssemblyImportException, "Could not parse JSON file.");
+	solRequire(jsonParse(_source, assemblyJson), AssemblyImportException, "Could not parse JSON file.");


How non-strict is jsonParse()? If the difference is just the null, it's fine, but I have a feeling that this will affect also some other aspects of parsing that we'd rather keep strict :)

Not restrictive at all, it uses the default settings. Please see setDefaults() and strictMode() for a comparison.

But we could change the strictRoot field in the strict settings to false, which will allow us to accept Json::null, however we need to double check possible consequences of this. The commit 772ff3e was just an example of the issue.

I reverted the changes anyway ;)

- (*settings)["collectComments"] = true; - (*settings)["allowComments"] = true; - (*settings)["allowTrailingCommas"] = true; - (*settings)["strictRoot"] = false; + (*settings)["allowComments"] = false; + (*settings)["allowTrailingCommas"] = false; + (*settings)["strictRoot"] = true; (*settings)["allowDroppedNullPlaceholders"] = false; (*settings)["allowNumericKeys"] = false; (*settings)["allowSingleQuotes"] = false; (*settings)["stackLimit"] = 1000; - (*settings)["failIfExtra"] = false; - (*settings)["rejectDupKeys"] = false; + (*settings)["failIfExtra"] = true; + (*settings)["rejectDupKeys"] = true; (*settings)["allowSpecialFloats"] = false; (*settings)["skipBom"] = true;

I see that there are some important differences. For example rejectDupKeys does not sound like something we'd like to disable. And even things like comments or trailing commas would be something we could not roll back later in a non-breaking way when we finally switch to a better maintained JSON library.

Using strictMode() with just the strictRoot flag flipped would probably be acceptable, but still, this is really something better done in a separate PR, because here there are too many other concerns mixed in. We do need to consider whether the new JSON parser will allow that or not.

test/cmdlineTests/asm_json_import_all_valid_flags/output

cameel · 2023-10-20T20:45:49Z

test/cmdlineTests/asm_json_import_verbatim/output

+  dup3
+    /* "<stdin>":51:63  : 29,... */
+  sstore
+    /* "<stdin>":84:86   " */
+  0x20
+    /* "<stdin>":71:87  "PUSH",... */
+  calldataload
+    /* "<stdin>":68:188  ": "PUSH",... */


These code snippets should not be printed. The input file (<stdin>) is listed as one of the sources, which makes no sense but makes the compiler use it as the source file. The thing is, in the EVM asm import mode we never have access to sources so while printing locations makes sense, the snippets should be disabled. I.e. of all the values supported by --debug-info (ast-id location snippet) only location should be allowed. And the default should be different.

cameel

I'm finally done with a comprehensive review of this, including reviewing past comments. There were quite a few tweaks that I did myself as fixups. For the rest I added comments, but since this PR has tons of comments, here's a summary.

These are IMO both important and easy enough to still fix before we merge:

These are less important issues and improvements that would be fine done later, in follow-up PRs:

And these are general issues that I pointed out, but probably don't have a good, quick solution:

nikola-matic · 2023-10-23T10:00:48Z

libevmasm/Assembly.cpp

+		if (!jumpType.empty())
+		{
+			if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)
+			{
+				std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);
+				if (!parsedJumpType.has_value())
+					solThrow(AssemblyImportException, "Invalid jump type.");
+				item.setJumpType(parsedJumpType.value());
+			}
+			else
+				solThrow(
+					AssemblyImportException,
+					"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"
+				);
+		}


Suggested change

if (!jumpType.empty())

{

if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)

{

std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);

if (!parsedJumpType.has_value())

solThrow(AssemblyImportException, "Invalid jump type.");

item.setJumpType(parsedJumpType.value());

}

else

solThrow(

AssemblyImportException,

"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"

);

}

if (!jumpType.empty())

if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)

{

std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);

if (!parsedJumpType.has_value())

solThrow(AssemblyImportException, "Invalid jump type.");

item.setJumpType(parsedJumpType.value());

}

else

solThrow(

AssemblyImportException,

"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"

);

Honestly, I'd be perfectly fine leaving this as is, since it's not checked by check_style.sh, and we'll eventually move to clang-format, where we'll have to auto format the whole project anyway. At your discretion then.

when clang-format? haha

libevmasm/Assembly.cpp

nikola-matic · 2023-10-23T10:11:24Z

libevmasm/Assembly.cpp

+		if (!jumpType.empty())
+		{
+			if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)
+			{
+				std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);
+				if (!parsedJumpType.has_value())
+					solThrow(AssemblyImportException, "Invalid jump type.");
+				item.setJumpType(parsedJumpType.value());
+			}
+			else
+				solThrow(
+					AssemblyImportException,
+					"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"
+				);
+		}


Suggested change

if (!jumpType.empty())

{

if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)

{

std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);

if (!parsedJumpType.has_value())

solThrow(AssemblyImportException, "Invalid jump type.");

item.setJumpType(parsedJumpType.value());

}

else

solThrow(

AssemblyImportException,

"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"

);

}

if (!jumpType.empty())

if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)

{

std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);

if (!parsedJumpType.has_value())

solThrow(AssemblyImportException, "Invalid jump type.");

item.setJumpType(parsedJumpType.value());

}

else

solThrow(

AssemblyImportException,

"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"

);

Again, just pointing out - feel free to ignore (or not if Kamil sees this).

nikola-matic · 2023-10-23T10:19:15Z

libevmasm/Assembly.cpp

+		for (Json::Value const& sourceName: _json["sourceList"])
+		{
+			solRequire(
+				std::find(parsedSourceList.begin(), parsedSourceList.end(), sourceName.asString()) == parsedSourceList.end(),


This is basically just a nested for loop - so a bit wasteful; wouldn't it make more sense to make parsedSourceList a std::set<std::string> and use parsedSourceSet.count(sourceName.asString()) instead, and then just copy it to a vector when/if needed, i.e. std::vector<std::string> parsedSourceList(parsedSourceSet.begin(), parsedSourceSet.end())?

Wouldn't that destroy the order though? The order here is very important.

But other than that, yeah, that would be more efficient for long lists.

Yes, it would, in which case we couldn't simply copy content over from the set into the vector, but could none the less use the same approach (i.e. using a set for a presence check, which technically wouldn't be any less efficient.

In that case it would be better to first load all items into a vector and then only use the set as a temporary helper to check if items are unique. Even better, wrap that in a reusable isUnique() helper. Or maybe such a helper even already exists in boost?

But IMO this is also good enough as is, given that the feature is experimental and we gave some other inefficiencies and suboptimal things a pass too. This could be improved to the follow-up refactor PR.

nikola-matic · 2023-10-23T10:21:44Z

libevmasm/Assembly.cpp

+		{
+			solAssert(dataIter.key().isString());
+			std::string dataItemID = dataIter.key().asString();
+			Json::Value const& dataItem = data[dataItemID];


This is a bit convoluted - can't we just get dataItem directly from dataIter?

I thought the same thing and apparently we can't. Though I only briefly looked at how the iterator is defined. I didn't see a member for it, but maybe you can find some mechanism for it if you look closer. In any case, I gave up myself on this because there were too many other things to adjust here.

Dereferencing the iterator should work, no?

Maybe. Feel free to check and suggest a variant that will work :)

Looks like it works. In any case, regarding your latest comment about priorities - I wouldn't worry about this too much, and I'd prefer not to push to the branch in case @r0qs is actively working on (which I'm assuming he is). So we can leave it for another time, unless of course, he notices this comment and does it himself :)

Pushing should not be a problem as long as you use --force-with-lease and only push new commits without modifying any existing ones or rebasing the whole branch. Then for @r0qs it would be a simple git rebase -i origin/import-asm-json-updated (and maybe some conflict resolutions, but this is a small localized change and everyone should be used to resolving conflicts by now anyway :)).

libevmasm/Assembly.cpp

cameel · 2023-10-23T11:38:37Z

Just to keep priorities clear here - we absolutely need this merged very soon, preferably today so I think @r0qs should focus on the three must-haves I listed in #13673 (review) - unless of course someone finds something serious, like a bug.

For the other minor improvements and refactors - feel free to just push fixups directly. Or just leave them be for now to be potentially addressed in some follow-up cleanup PR. I already fixed a lot of these myself while reviewing this last week. All the big, structural problems, like the import being part of CompilerStack are fixed at this point and remaining ones IMO are not serious enough to block the PR.

libevmasm/EVMAssemblyStack.cpp

libevmasm/Assembly.cpp

test/cmdlineTests/asm_json_import_missing_subobjects_indices/stdin

Changelog.md

…ly named dir

Co-authored-by: Alexander Arlt <[email protected]> Co-authored-by: r0qs <[email protected]>

Co-authored-by: Kamil Śliwak <[email protected]> Co-authored-by: r0qs <[email protected]>

All the important problems have been addressed.

aarlt self-assigned this Nov 2, 2022

aarlt force-pushed the import-asm-json-updated branch from 1eb319b to 9651527 Compare November 2, 2022 13:49

nikola-matic reviewed Nov 2, 2022

View reviewed changes

libevmasm/Assembly.cpp Outdated Show resolved Hide resolved

github-actions bot added the external contribution ⭐ label Nov 2, 2022

argotorg deleted a comment from github-actions bot Nov 2, 2022

r0qs removed the external contribution ⭐ label Nov 2, 2022

aarlt force-pushed the import-asm-json-updated branch 2 times, most recently from 79a3849 to 27da0cd Compare November 3, 2022 15:48

argotorg deleted a comment from stackenbotten Nov 3, 2022

aarlt force-pushed the import-asm-json-updated branch from 27da0cd to 858ed50 Compare November 3, 2022 15:55

nikola-matic reviewed Nov 3, 2022

View reviewed changes

libevmasm/Assembly.cpp Outdated Show resolved Hide resolved

nikola-matic reviewed Nov 3, 2022

View reviewed changes

libsolidity/interface/CompilerStack.cpp Outdated Show resolved Hide resolved

nikola-matic reviewed Nov 3, 2022

View reviewed changes

cameel reviewed Oct 20, 2023

View reviewed changes

cameel force-pushed the import-asm-json-updated branch from 5794f98 to a2fdac2 Compare October 20, 2023 19:36

cameel reviewed Oct 20, 2023

View reviewed changes

test/cmdlineTests/asm_json_import_all_valid_flags/output Show resolved Hide resolved

cameel reviewed Oct 20, 2023

View reviewed changes

cameel force-pushed the import-asm-json-updated branch from a2fdac2 to 1bae0e1 Compare October 20, 2023 20:49

cameel reviewed Oct 20, 2023

View reviewed changes

r0qs force-pushed the import-asm-json-updated branch from 1bae0e1 to e33f75b Compare October 23, 2023 07:52

nikola-matic reviewed Oct 23, 2023

View reviewed changes

cameel reviewed Oct 23, 2023

View reviewed changes

libevmasm/EVMAssemblyStack.cpp Outdated Show resolved Hide resolved

r0qs force-pushed the import-asm-json-updated branch from e4e6f59 to fa75409 Compare October 23, 2023 14:55

cameel reviewed Oct 23, 2023

View reviewed changes

libevmasm/Assembly.cpp Outdated Show resolved Hide resolved

cameel reviewed Oct 23, 2023

View reviewed changes

test/cmdlineTests/asm_json_import_missing_subobjects_indices/stdin Outdated Show resolved Hide resolved

r0qs force-pushed the import-asm-json-updated branch 3 times, most recently from 840e7e3 to 9d0a4de Compare October 23, 2023 22:45

ekpyron requested changes Oct 24, 2023

View reviewed changes

Changelog.md Outdated Show resolved Hide resolved

cameel force-pushed the import-asm-json-updated branch from 9d0a4de to c3825ed Compare October 24, 2023 14:10

cameel and others added 3 commits October 24, 2023 16:46

ASTImportTest.sh: Keep temporary files in subdirs of a single, proper…

e8b898c

…ly named dir

Introduce AbstractAssemblyStack interface

3e823e0

Co-authored-by: Alexander Arlt <[email protected]> Co-authored-by: r0qs <[email protected]>

EVM assembly import via --import-asm-json

67e11fc

Co-authored-by: Kamil Śliwak <[email protected]> Co-authored-by: r0qs <[email protected]>

cameel force-pushed the import-asm-json-updated branch from c3825ed to da10cb9 Compare October 24, 2023 14:51

CLI test for mass EVM assembly import/export

91c7b32

Co-authored-by: Kamil Śliwak <[email protected]> Co-authored-by: r0qs <[email protected]>

cameel force-pushed the import-asm-json-updated branch from da10cb9 to 91c7b32 Compare October 24, 2023 15:07

ekpyron approved these changes Oct 24, 2023

View reviewed changes

ekpyron merged commit c7e5212 into develop Oct 24, 2023

ekpyron deleted the import-asm-json-updated branch October 24, 2023 17:41

stackenbotten3000 added this to Solidity Focus Board Aug 28, 2025

stackenbotten3000 moved this to Done in Solidity Focus Board Aug 28, 2025

		solRequire(jsonParseStrict(_source, assemblyJson), AssemblyImportException, "Could not parse JSON file.");
		solRequire(jsonParse(_source, assemblyJson), AssemblyImportException, "Could not parse JSON file.");

Add support to import evm assembly json (updated). #13673

Add support to import evm assembly json (updated). #13673

Uh oh!

Conversation

aarlt commented Nov 2, 2022 • edited by r0qs Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

aarlt commented Nov 2, 2022

Uh oh!

nikola-matic commented Nov 2, 2022

Uh oh!

Uh oh!

nikola-matic commented Nov 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikola-matic commented Nov 3, 2022

Uh oh!

nikola-matic commented Nov 3, 2022

Uh oh!

aarlt commented Nov 3, 2022

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cameel left a comment • edited by r0qs Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cameel Oct 23, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cameel commented Oct 23, 2023

Uh oh!

Uh oh!

Uh oh!

Uh oh!

aarlt commented Nov 2, 2022 •

edited by r0qs

Loading

nikola-matic commented Nov 2, 2022 •

edited

Loading

cameel left a comment •

edited by r0qs

Loading

cameel Oct 23, 2023 •

edited

Loading