Set up Ghidra integration tests#327
Conversation
|
Nice! About the original binaries in CI: is the Ghidra project under I haven't looked at running CI stuff in a container very much. Actually, I think the last time I did, I ran into this same problem: no way to prepare files in some location to be volume mounted into the container. You'd think other people would have run into this problem too. |
It was more of a performance consideration. It takes quite a while to import and analyze a new Ghidra project, and I didn't want to do that every time I run the tests (there's quite a bit of logic for handling data that's already there, so importing without analyzing might not be sufficiently representative of the real world). Since the Ghidra version is pinned in the CI script, incompatibilities should only occur when upgrading the version manually. What would be your preferred way of handling this? Cache the project locally after the first import? |
937f131 to
41c4630
Compare
I guess we could do that. I wasn't thinking about the performance angle. It would all depend on how long it takes to analyze once, and that may be slow enough. (Given that our playwright tests take ~2 minutes.) My preference to not embed binary files is just that we can't audit the (potential) changes in any real way. But we are sort of stuck because these are the only relevant samples for our tool. The alternative would be to mock the communication with Ghidra's API and not use a real project at all. How feasible is that? |
I'll do a benchmark to have a basis for the decision. Will move the PR back to draft until then.
I specifically wanted a real Ghidra instance for the tests because there is a lot of code handling obscure edge cases, which are not obvious from the API at all. |
Then we'll go with that. If we do end up including the Ghidra files here (i.e. if generating once and caching doesn't work out) would it be better to use a file other than ISLE.EXE since none of the tests depend on its specific contents? |
Any file would do, suggestions are welcome :) I didn't have any good ideas and didn't want to get stuck on this point |
|
I built a proof-of-concept and ran a few benchmarks:
So a non-cached startup takes ~24 seconds, while a cached startup takes ~ 6 seconds. I'll go for the cache approach. I haven't yet looked at the Git cache download issue again, though. |
|
I sort of got the cache to work. According to actions/cache#1361 (comment), the cache operation does not share caches across runners with different containers, so I had to set up a second cache for the Ghidra runs. Still needs some polish before it can be reviewed. |
| _scalar_type_map: dict[CvdumpTypeKey, BuiltIn] = { | ||
| CVInfoTypeEnum.T_VOID: VoidDataType(), | ||
| CVInfoTypeEnum.T_HRESULT: LongDataType(), | ||
| CVInfoTypeEnum.T_CHAR: CharDataType(), | ||
| CVInfoTypeEnum.T_SHORT: ShortDataType(), | ||
| CVInfoTypeEnum.T_LONG: LongDataType(), | ||
| CVInfoTypeEnum.T_QUAD: LongLongDataType(), | ||
| CVInfoTypeEnum.T_UCHAR: UnsignedCharDataType(), | ||
| CVInfoTypeEnum.T_USHORT: UnsignedShortDataType(), | ||
| CVInfoTypeEnum.T_ULONG: UnsignedLongDataType(), | ||
| CVInfoTypeEnum.T_UQUAD: UnsignedLongLongDataType(), | ||
| CVInfoTypeEnum.T_REAL32: FloatDataType(), | ||
| CVInfoTypeEnum.T_REAL64: DoubleDataType(), | ||
| CVInfoTypeEnum.T_RCHAR: CharDataType(), | ||
| CVInfoTypeEnum.T_WCHAR: WideCharDataType(), | ||
| CVInfoTypeEnum.T_INT4: IntegerDataType(), | ||
| CVInfoTypeEnum.T_UINT4: UnsignedIntegerDataType(), | ||
| } |
There was a problem hiding this comment.
I refactored these again, now they also work regardless of whether an analysis has been performed
|
|
||
| # Not exactly sure why this is necessary, but it can't hurt | ||
| GhidraScriptUtil.acquireBundleHostReference() | ||
|
|
There was a problem hiding this comment.
Only required to run an analysis
| mypy==1.16.1 | ||
| types-colorama>=0.4.6 | ||
| ghidra-stubs==11.4 | ||
| ghidra-stubs==12.0.2 |
There was a problem hiding this comment.
Running the tests locally requires Ghidra 12. The import still works with Ghidra 11.4
|
@jonschz Thanks for your hard work to get this running without the embedded project. It looks great! The new type_conversion module is a big improvement. If/when we add more types I expect the process will go smoothly. Not a blocker because we are running fine in the container, but I can't run the tests locally. Did you ever see errors like this from If I call just |
tests/conftest.py
Outdated
| # Revert all side effects of the test that just ran | ||
| transaction.abort() |
There was a problem hiding this comment.
Out of curiosity, would the transaction still revert if the calling function (using the context of the ghidra program) threw an exception?
There was a problem hiding this comment.
I am not entirely sure, but it would be prudent to use try-finally. Good point
I am able to run the tests locally and I do see a bunch of those, but those are red herrings - they don't hinder the test execution, at least on my machine. In general, I found debugging issues to be a bit difficult since pytest tends to swallow logs from errors in fixtures. Do you have the latest version (Ghidra 12.0.4) installed and |
That was it, thanks. I'm always a few versions behind with Ghidra. |
At long last, there are some integration tests for Ghidra. The setup is still very basic, I am planning to build on it in future PRs.
For the binary I used a recompiled
ISLE.EXE, though the test I wrote so far does not depend on the binary. We can still change that, it was the first thing to come to my mind.I couldn't get the setup for the original binaries to work in GitHub actions, so I disabled the tests for now.
I also plan to do transactions and rollback in the fixture once I have tests that actually make meaningful modifications.Edit: doneFurthermore, I refactored importing scalars, which made one exception type obsolete.