Skip to content

libevmasm: refactor asm-json export & add support for source list. #12799

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 24, 2022

Conversation

aarlt
Copy link
Member

@aarlt aarlt commented Mar 15, 2022

Needed for #11787.
Replaces part of #12704.

This PR refactors the asm-json export and adds support for source list (e.g. needed for srcmaps during asm-json import).

@aarlt aarlt force-pushed the asm-json-export branch from 69be6bd to 078255f Compare March 15, 2022 21:39
@aarlt aarlt force-pushed the asm-json-export branch 2 times, most recently from 578998d to 9d7cc21 Compare March 16, 2022 17:11
@aarlt aarlt force-pushed the asm-json-export branch from 9d7cc21 to f37f081 Compare March 16, 2022 17:28
@aarlt aarlt force-pushed the asm-json-export branch from f37f081 to 6502a4b Compare March 16, 2022 21:53
@chriseth
Copy link
Contributor

I think this can be even further simplified by creating a function (can be a method of AssemblyItem) that returns pair<string, string> which is the name and the value. The function would then just be

switch (_item.type())
{
case Operation: return {instructionInfo(_item.instruction()).name, ""};
case Push: return {"PUSH", toStringInHex(_item.data())};
...
}

The rest of the values is then added in the loop inside Assembly.cpp and inside this loop, we also add the additional jumpdest for tags.

What do you think?

@@ -39,6 +39,7 @@
#include <sstream>
#include <memory>
#include <map>
#include <utility>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Never knew that C++ had such a catch-all header for basic stuff. I wonder if we should be using it more often instead of including things individually.

Comment on lines 231 to 235
if (_item.m_modifierDepth != 0)
value["modifierDepth"] = static_cast<int>(_item.m_modifierDepth);
if (!_value.empty())
value["value"] = _value;
if (!_jumpType.empty())
value["jumpType"] = _jumpType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that after your changes, "jumpType" field is no longer being filled. On the other hand you're now filling "modifierDepth". This seems like an API change rather than just a refactor. Is that intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I removed jumpType yesterday, because I saw that it is just stored in value (for Operation). Regarding the old code, jumpType is only set for PUSH. Not sure why.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chriseth why is jumpType needed for PUSH?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is it just a typo in the code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a problem with the old code. Yes, jumptype was output for push, but also for operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you are right, it was stored in the wrong location. Why did nobody ever complain?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess nobody really used it yet. Decision from call: store jump type's always in jumpType node.

Comment on lines 1582 to +1585
}
]
}
}
},
Copy link
Member

@cameel cameel Mar 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the removal of "jumpType" and addition of "modifierDepth" not reflected in this test? I'd expect it to fail after these changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now also wondering. Yesterday I changed exactly this part. The thing is that in our current tests we have it like this:

        {
          "begin": 77,
          "end": 158,
          "name": "JUMP",
          "source": 0,
          "value": "[in]"
        }

So I saw yesterday, that the jumpType is stored in value - so I just changed that - without really checking the original code. The strange thing is that the original code reads like that there should be an jumpType node. Something seem to be wrong. I will look into this. Thx for that hint!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For modifierDepth it is just needed to be able to import the asm json correctly. Without this information it is not possible to generate the original byte code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that JSON the function creates is not the final form but later goes through some extra processing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean. I think modifierDepth was just forgotten when asm-json was first implemented.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. From your comment I thought that modifierDepth is used internally but then not written out on purpose - so there must be some postprocessing step that removes it (and maybe does other stuff).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just needed to add modifierDepth. It was just missing in the original asm-json implementation. There is no post processing or any other extra processing involved. This information was just missing, and now this information will be exported correctly.

Copy link
Member Author

@aarlt aarlt Mar 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to also write it here in this thread: decision from call: always store jump type in a jumpType node. @cameel thx for making us aware of that!

@aarlt aarlt force-pushed the asm-json-export branch from 6502a4b to 87a0819 Compare March 17, 2022 21:52
default:
assertThrow(false, InvalidOpcode, "");
if (name == "JUMP")
item["value"] = jumpType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment that this is replicating a bug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced this now with

		if (!jumpType.empty())
			item["jumpType"] = jumpType;

so whenever a jump type was defined, it will be stored in an jumpType node.

The original behaviour for JUMP (and maybe also other operations) was to store it in value. We will now store this in jumpType.

assertThrow(false, InvalidOpcode, "");
if (name == "JUMP")
item["value"] = jumpType;
if (name == "PUSH")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (name == "PUSH")
else

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will store this now always in jumpType.

@@ -56,6 +57,51 @@ pair<size_t, size_t> AssemblyItem::splitForeignPushTag() const
return make_pair(subId, tag);
}

string AssemblyItem::toStringInHex(u256 _value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we already have a utility function for that? In any case, please turn this into a function inside an anonnymous namespace at least.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I was not able to put it in exactly the needed hex format with the other utility functions. Moved to anonymous namespace.

@aarlt aarlt force-pushed the asm-json-export branch from 87a0819 to f09e542 Compare March 21, 2022 14:50
@chriseth
Copy link
Contributor

Since this also fixes a bug now, it needs a changelog entry.

for (auto const& [name, index]: _sourceIndices)
sourceNames[index] = name;
for (auto const& item: sourceNames)
jsonSourceList.append(item);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you make this a json array and then avoid the vector?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual order will be defined with the first for loop. The json array seem not to allow the insertion of items at specific positions. I just use the vector to store the final list of source names, ordered according to its index.

@@ -147,7 +148,8 @@ class Assembly

/// Create a JSON representation of the assembly.
Json::Value assemblyJSON(
std::map<std::string, unsigned> const& _sourceIndices = std::map<std::string, unsigned>()
std::map<std::string, unsigned> const& _sourceIndices = std::map<std::string, unsigned>(),
bool _includeSourceList = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this default to false to retain the old behaviour? Do we need a default value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I set includeSourceList to false, only if a sub-assembly is exported. In CompilerStack::assemblyJSON(..) we want to have the source list included. If called with default value, the top-level assembly json will always include the source list, but it's sub-assemblies not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this should also be mentioned in the changelog.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats true!

@chriseth
Copy link
Contributor

Can you add a modifier to the commandline test so we can test the modifier depth?

@aarlt aarlt force-pushed the asm-json-export branch 2 times, most recently from 2f386ea to a3d5c9d Compare March 21, 2022 17:42
@aarlt aarlt force-pushed the asm-json-export branch 2 times, most recently from aeab412 to e4acc25 Compare March 21, 2022 22:36
Changelog.md Outdated
@@ -8,6 +8,7 @@ Compiler Features:


Bugfixes:
* Assembly: Fix assembly json export to store jump types of operations in `jumpType` field (instead of `value`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Assembly: Fix assembly json export to store jump types of operations in `jumpType` field (instead of `value`).
* Assembly-Json: Fix assembly json export to store jump types of operations in `jumpType` field instead of `value`.

Comment on lines 273 to 279
root["sourceList"] = Json::arrayValue;
Json::Value& jsonSourceList = root["sourceList"];
vector<string> sourceNames(_sourceIndices.size());
for (auto const& [name, index]: _sourceIndices)
sourceNames[index] = name;
for (auto const& item: sourceNames)
jsonSourceList.append(item);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
root["sourceList"] = Json::arrayValue;
Json::Value& jsonSourceList = root["sourceList"];
vector<string> sourceNames(_sourceIndices.size());
for (auto const& [name, index]: _sourceIndices)
sourceNames[index] = name;
for (auto const& item: sourceNames)
jsonSourceList.append(item);
root["sourceList"] = Json::arrayValue;
for (auto const& [name, index]: _sourceIndices)
root["sourceList"][index] = name;

I'm pretty sure this does it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, great! Somehow I did not check the operators. Thx!

@chriseth
Copy link
Contributor

Only cosmetic changes left.

@aarlt aarlt force-pushed the asm-json-export branch from b9504ee to f3dd7cd Compare March 23, 2022 20:50
@aarlt aarlt force-pushed the asm-json-export branch from f3dd7cd to 1a0988e Compare March 23, 2022 20:58
@chriseth chriseth merged commit b08190c into develop Mar 24, 2022
@chriseth chriseth deleted the asm-json-export branch March 24, 2022 11:13
@Sticknmove
Copy link

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants