Add support to import evm assembly json (updated). #13673
Execution of the EVM Assembly JSON import/export tests needs ~20 minutes.
1eb319b
to
9651527
Compare
You got tons of
Also, this needs a changelog entry, since it introduces a new input mode, right?
So I see that #13576 is a dependency, but it looks like you've deleted (or renamed)
Also, if you rebase against develop, you can finally get them sweet sweet green builds.
79a3849
to
27da0cd
Compare
27da0cd
to
858ed50
Compare
Yeah, initially we had import/export tests only for the AST ( In this PR the renaming is done in a single commit. But we could also rename the script later. That would probably make the PR easier to read.
libevmasm/Assembly.cpp
Outdated
{
	shared_ptr<Assembly> subassembly(Assembly::loadFromAssemblyJSON(code, sourceList, /* isCreation = */ false));
	assertThrow(subassembly, AssemblyException, "");
	result->m_subs.emplace_back(make_shared<Assembly>(*subassembly));
Could m_subs be a vector of unique_ptrs? If it can, it should, since from what I can tell, we're only iterating over its contents later on, so there is no need for this to be a shared_ptr.
Maybe it's possible to change that, but it would be an unrelated change. If it makes sense, we could create another PR for it.
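To make the suggestion concrete, here is a minimal sketch of exclusive ownership of sub-objects. `Node`, `addSub()` and `sumValues()` are hypothetical stand-ins invented for illustration; this is not the actual `Assembly` class, and whether `m_subs` can really be changed this way depends on code not shown in this thread.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for Assembly: a node that exclusively owns its children.
struct Node
{
	int value = 0;
	std::vector<std::unique_ptr<Node>> subs;

	// Takes ownership of the child; no reference counting is involved.
	Node& addSub(std::unique_ptr<Node> _sub)
	{
		subs.emplace_back(std::move(_sub));
		return *subs.back();
	}
};

// Iteration only needs references, so exclusive ownership is sufficient.
int sumValues(Node const& _node)
{
	int total = _node.value;
	for (auto const& sub: _node.subs)
		total += sumValues(*sub);
	return total;
}
```

One consequence worth keeping in mind: a type holding a `std::vector<std::unique_ptr<...>>` is no longer implicitly copyable, so a call like `make_shared<Assembly>(*subassembly)`, which copies, would presumably need to become a move.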
libevmasm/EVMAssemblyStack.cpp
Outdated
- solRequire(jsonParseStrict(_source, assemblyJson), AssemblyImportException, "Could not parse JSON file.");
+ solRequire(jsonParse(_source, assemblyJson), AssemblyImportException, "Could not parse JSON file.");
How non-strict is jsonParse()? If the difference is just the null, it's fine, but I have a feeling that this will also affect some other aspects of parsing that we'd rather keep strict :)
Not restrictive at all; it uses the default settings. Please see setDefaults() and strictMode() for a comparison.
But we could change the strictRoot field in the strict settings to false, which would allow us to accept Json::null. However, we need to double-check the possible consequences of this. The commit 772ff3e was just an example of the issue.
I reverted the changes anyway ;)
- (*settings)["collectComments"] = true;
- (*settings)["allowComments"] = true;
- (*settings)["allowTrailingCommas"] = true;
- (*settings)["strictRoot"] = false;
+ (*settings)["allowComments"] = false;
+ (*settings)["allowTrailingCommas"] = false;
+ (*settings)["strictRoot"] = true;
(*settings)["allowDroppedNullPlaceholders"] = false;
(*settings)["allowNumericKeys"] = false;
(*settings)["allowSingleQuotes"] = false;
(*settings)["stackLimit"] = 1000;
- (*settings)["failIfExtra"] = false;
- (*settings)["rejectDupKeys"] = false;
+ (*settings)["failIfExtra"] = true;
+ (*settings)["rejectDupKeys"] = true;
(*settings)["allowSpecialFloats"] = false;
(*settings)["skipBom"] = true;
I see that there are some important differences. For example, rejectDupKeys does not sound like something we'd like to disable. And even things like comments or trailing commas would be something we could not roll back later in a non-breaking way when we finally switch to a better maintained JSON library.
Using strictMode() with just the strictRoot flag flipped would probably be acceptable, but still, this is really something better done in a separate PR, because here there are too many other concerns mixed in. We do need to consider whether the new JSON parser will allow that or not.
5794f98
to
a2fdac2
Compare
dup3
/* "<stdin>":51:63 : 29,... */
sstore
/* "<stdin>":84:86 " */
0x20
/* "<stdin>":71:87 "PUSH",... */
calldataload
/* "<stdin>":68:188 ": "PUSH",... */
These code snippets should not be printed. The input file (<stdin>) is listed as one of the sources, which makes no sense but makes the compiler use it as the source file. The thing is, in the EVM asm import mode we never have access to sources, so while printing locations makes sense, the snippets should be disabled. I.e. of all the values supported by --debug-info (ast-id, location, snippet) only location should be allowed. And the default should be different.
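The restriction described above could be checked along these lines. This is only a sketch: `debugInfoAllowedInAsmImport()` is a hypothetical helper that does not exist in solc, and representing the `--debug-info` components as a set of strings is an assumption made for illustration.

```cpp
#include <set>
#include <string>

// Hypothetical helper, not part of solc: returns true if every requested
// --debug-info component may be used when importing EVM assembly JSON.
// Following the review comment, only "location" is accepted in that mode.
bool debugInfoAllowedInAsmImport(std::set<std::string> const& _components)
{
	for (std::string const& component: _components)
		if (component != "location")
			return false;
	return true;
}
```

With this shape, an empty selection passes, `{"location"}` passes, and anything containing `snippet` or `ast-id` is rejected, which would let the command-line layer report an error instead of printing meaningless snippets.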
a2fdac2
to
1bae0e1
Compare
I'm finally done with a comprehensive review of this, including reviewing past comments. There were quite a few tweaks that I did myself as fixups. For the rest I added comments, but since this PR has tons of comments, here's a summary.
These are IMO both important and easy enough to still fix before we merge:
- Segfault on source indexes out of range
- Leftover yulUtilityFileName(), probably broken
- Go back to jsonParseStrict()
These are less important issues and improvements that would be fine done later, in follow-up PRs:
- No output when import is successful.
- Code snippets should not be printed, only code locations.
- Restore test coverage for correct subassembly order
- Compiler produces empty JSON but cannot import it
- Storing all possible subpaths is an overkill
- Strings allocated by asm import are never freed
- ASTImportTest.sh rename
And these are general issues that I pointed out, but probably don't have a good, quick solution:
1bae0e1
to
e33f75b
Compare
if (!jumpType.empty())
{
	if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)
	{
		std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);
		if (!parsedJumpType.has_value())
			solThrow(AssemblyImportException, "Invalid jump type.");
		item.setJumpType(parsedJumpType.value());
	}
	else
		solThrow(
			AssemblyImportException,
			"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"
		);
}
Suggested change:
if (!jumpType.empty())
	if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)
	{
		std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);
		if (!parsedJumpType.has_value())
			solThrow(AssemblyImportException, "Invalid jump type.");
		item.setJumpType(parsedJumpType.value());
	}
	else
		solThrow(
			AssemblyImportException,
			"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"
		);
Honestly, I'd be perfectly fine leaving this as is, since it's not checked by check_style.sh, and we'll eventually move to clang-format, where we'll have to auto-format the whole project anyway. At your discretion then.
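For readers wondering whether dropping the outer braces changes behaviour: it does not, because in C++ an `else` always binds to the nearest preceding `if`. The sketch below mirrors the control flow of the jump type check with two hypothetical functions (`classifyBraced()` and `classifyUnbraced()`, invented for illustration) that return a label instead of throwing.

```cpp
#include <string>

// Hypothetical classifier mirroring the jumpType check, with the outer braces.
std::string classifyBraced(bool _hasJumpType, bool _isJump)
{
	if (_hasJumpType)
	{
		if (_isJump)
			return "set";
		else
			return "error";
	}
	return "skip";
}

// Same logic without the outer braces: the `else` still belongs to the
// inner `if`, so both functions return the same result for every input.
std::string classifyUnbraced(bool _hasJumpType, bool _isJump)
{
	if (_hasJumpType)
		if (_isJump)
			return "set";
		else
			return "error";
	return "skip";
}
```

Note that GCC and Clang may emit a dangling-else warning for the second form, which is one practical argument for keeping the braces.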
when clang-format? haha
if (!jumpType.empty())
{
	if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)
	{
		std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);
		if (!parsedJumpType.has_value())
			solThrow(AssemblyImportException, "Invalid jump type.");
		item.setJumpType(parsedJumpType.value());
	}
	else
		solThrow(
			AssemblyImportException,
			"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"
		);
}
Suggested change:
if (!jumpType.empty())
	if (item.instruction() == Instruction::JUMP || item.instruction() == Instruction::JUMPI)
	{
		std::optional<AssemblyItem::JumpType> parsedJumpType = AssemblyItem::parseJumpType(jumpType);
		if (!parsedJumpType.has_value())
			solThrow(AssemblyImportException, "Invalid jump type.");
		item.setJumpType(parsedJumpType.value());
	}
	else
		solThrow(
			AssemblyImportException,
			"Member 'jumpType' set on instruction different from JUMP or JUMPI (was set on instruction '" + name + "')"
		);
Again, just pointing out - feel free to ignore (or not if Kamil sees this).
for (Json::Value const& sourceName: _json["sourceList"])
{
	solRequire(
		std::find(parsedSourceList.begin(), parsedSourceList.end(), sourceName.asString()) == parsedSourceList.end(),
This is basically just a nested for loop - so a bit wasteful; wouldn't it make more sense to make parsedSourceList a std::set<std::string> and use parsedSourceSet.count(sourceName.asString()) instead, and then just copy it to a vector when/if needed, i.e. std::vector<std::string> parsedSourceList(parsedSourceSet.begin(), parsedSourceSet.end())?
Wouldn't that destroy the order though? The order here is very important.
But other than that, yeah, that would be more efficient for long lists.
Yes, it would, in which case we couldn't simply copy content over from the set into the vector, but could nonetheless use the same approach (i.e. using a set for a presence check), which technically wouldn't be any less efficient.
In that case it would be better to first load all items into a vector and then only use the set as a temporary helper to check if items are unique. Even better, wrap that in a reusable isUnique() helper. Or maybe such a helper even already exists in boost?
But IMO this is also good enough as is, given that the feature is experimental and we gave some other inefficiencies and suboptimal things a pass too. This could be improved in the follow-up refactor PR.
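A minimal sketch of the isUnique() helper floated above, assuming the items have already been loaded into a vector of strings. The signature is hypothetical; no such helper is confirmed to exist in the codebase or in boost.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: true if no element of _items occurs more than once.
// The vector itself is left untouched, so the original order is preserved;
// the set only lives for the duration of the check.
bool isUnique(std::vector<std::string> const& _items)
{
	std::set<std::string> seen;
	for (std::string const& item: _items)
		if (!seen.insert(item).second) // insert() reports whether the item was new
			return false;
	return true;
}
```

This replaces the quadratic std::find-per-element check with O(n log n) work, while the source list keeps its order because the set is never copied back into the vector.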
{
	solAssert(dataIter.key().isString());
	std::string dataItemID = dataIter.key().asString();
	Json::Value const& dataItem = data[dataItemID];
This is a bit convoluted - can't we just get dataItem directly from dataIter?
I thought the same thing and apparently we can't. Though I only briefly looked at how the iterator is defined. I didn't see a member for it, but maybe you can find some mechanism for it if you look closer. In any case, I gave up myself on this because there were too many other things to adjust here.
Dereferencing the iterator should work, no?
Maybe. Feel free to check and suggest a variant that will work :)
Looks like it works. In any case, regarding your latest comment about priorities - I wouldn't worry about this too much, and I'd prefer not to push to the branch in case @r0qs is actively working on it (which I'm assuming he is). So we can leave it for another time, unless of course, he notices this comment and does it himself :)
Pushing should not be a problem as long as you use --force-with-lease and only push new commits without modifying any existing ones or rebasing the whole branch. Then for @r0qs it would be a simple git rebase -i origin/import-asm-json-updated (and maybe some conflict resolutions, but this is a small localized change and everyone should be used to resolving conflicts by now anyway :)).
Just to keep priorities clear here - we absolutely need this merged very soon, preferably today, so I think @r0qs should focus on the three must-haves I listed in #13673 (review) - unless of course someone finds something serious, like a bug. For the other minor improvements and refactors - feel free to just push fixups directly. Or just leave them be for now to be potentially addressed in some follow-up cleanup PR. I already fixed a lot of these myself while reviewing this last week. All the big, structural problems, like the import being part of
e4e6f59
to
fa75409
Compare
test/cmdlineTests/asm_json_import_missing_subobjects_indices/stdin
Outdated
840e7e3
to
9d0a4de
Compare
9d0a4de
to
c3825ed
Compare
Co-authored-by: Alexander Arlt <[email protected]> Co-authored-by: r0qs <[email protected]>
Co-authored-by: Kamil Śliwak <[email protected]> Co-authored-by: r0qs <[email protected]>
c3825ed
to
da10cb9
Compare
Co-authored-by: Kamil Śliwak <[email protected]> Co-authored-by: r0qs <[email protected]>
da10cb9
to
91c7b32
Compare
All the important problems have been addressed.
Target branch is import-export-script-refactoring (see #13576).
Depends on #13576 (merged), #13577 (merged), #13578 (merged) and #13579 (merged).
Closes #11787. Replaces #12834.
TODO
- Rename scripts/ASTImportTest.sh to scripts/ImportExportTest.sh (will be done as last step before merging)
- EVM Assembly and AST import/export equivalence tests in parallel
- Assembly.cpp to account for this comment: Add support to import evm assembly json (updated). #13673 (comment) (Make a separate PR)
- --asm-json output in the assembler mode (based on this comment: Add support to import evm assembly json (updated). #13673 (comment))
- CommandLineInterface::handleCombinedJSON() to account for: Add support to import evm assembly json (updated). #13673 (comment)