Set up Ghidra integration tests by jonschz · Pull Request #327 · isledecomp/reccmp

jonschz · 2026-02-28T15:14:57Z

At long last, there are some integration tests for Ghidra. The setup is still very basic; I am planning to build on it in future PRs.

For the binary I used a recompiled ISLE.EXE, though the test I have written so far does not depend on the binary. We can still change that; it was simply the first thing that came to mind.

I couldn't get the setup for the original binaries to work in GitHub Actions, so I disabled those tests for now.

~~I also plan to do transactions and rollback in the fixture once I have tests that actually make meaningful modifications.~~ Edit: done

Furthermore, I refactored importing scalars, which made one exception type obsolete.

reccmp/cvdump/cvinfo.py

reccmp/ghidra/importer/ghidra_helper.py

disinvite · 2026-02-28T21:47:32Z

Nice! About the original binaries in CI: is the Ghidra project under tests/ghidra there because the download inside the container didn't work? i.e. was your original idea to start a new Ghidra project using one of the files for each test? I think it would be better to create the project on the fly so then we don't have to worry about Ghidra version compatibility in the future.

I haven't looked at running CI stuff in a container very much. Actually, I think the last time I did, I ran into this same problem: no way to prepare files in some location to be volume mounted into the container. You'd think other people would have run into this problem too.

jonschz · 2026-03-01T07:24:33Z

> I think it would be better to create the project on the fly so then we don't have to worry about Ghidra version compatibility in the future.

It was more of a performance consideration. It takes quite a while to import and analyze a new Ghidra project, and I didn't want to do that every time I run the tests (there's quite a bit of logic for handling data that's already there, so importing without analyzing might not be sufficiently representative of the real world). Since the Ghidra version is pinned in the CI script, incompatibilities should only occur when upgrading the version manually.

What would be your preferred way of handling this? Cache the project locally after the first import?
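A minimal sketch of what "cache the project locally after the first import" could look like. The function name `get_or_build_project`, the `.complete` marker file, and the `build` callback are all hypothetical and are not taken from the PR; `build` stands in for whatever imports and analyzes the binary.

```python
from pathlib import Path
from typing import Callable, Tuple


def get_or_build_project(
    cache_dir: Path, build: Callable[[Path], None]
) -> Tuple[Path, bool]:
    """Return (project_dir, was_cached).

    `build` is a hypothetical callback that would create the Ghidra
    project, import the binary and run the analysis in `project_dir`.
    """
    project_dir = cache_dir / "ghidra_project"
    marker = project_dir / ".complete"

    # Reuse the project only if a previous build finished successfully.
    if marker.exists():
        return project_dir, True

    project_dir.mkdir(parents=True, exist_ok=True)
    build(project_dir)
    # The marker is written last, so an interrupted build is not reused.
    marker.touch()
    return project_dir, False
```

The marker file is one way to avoid reusing a half-built project; a pinned Ghidra version could also be folded into the directory name so that an upgrade triggers a rebuild.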

disinvite · 2026-03-01T17:42:29Z

> What would be your preferred way of handling this? Cache the project locally after the first import?

I guess we could do that. I wasn't thinking about the performance angle. It would all depend on how long it takes to analyze once, and that may be slow enough. (Given that our playwright tests take ~2 minutes.)

My reason for preferring not to embed binary files is just that we can't audit the (potential) changes in any real way. But we are sort of stuck because these are the only relevant samples for our tool.

The alternative would be to mock the communication with Ghidra's API and not use a real project at all. How feasible is that?

jonschz · 2026-03-01T18:50:38Z

> I guess we could do that. I wasn't thinking about the performance angle. It would all depend on how long it takes to analyze once, and that may be slow enough.

I'll do a benchmark to have a basis for the decision. Will move the PR back to draft until then.

> The alternative would be to mock the communication with Ghidra's API and not use a real project at all. How feasible is that?

I specifically wanted a real Ghidra instance for the tests because there is a lot of code handling obscure edge cases, which are not obvious from the API at all.

disinvite · 2026-03-01T21:36:06Z

> I specifically wanted a real Ghidra instance for the tests because there is a lot of code handling obscure edge cases, which are not obvious from the API at all.

Then we'll go with that. If we do end up including the Ghidra files here (i.e. if generating once and caching doesn't work out) would it be better to use a file other than ISLE.EXE since none of the tests depend on its specific contents?

jonschz · 2026-03-02T15:20:40Z

> would it be better to use a file other than ISLE.EXE since none of the tests depend on its specific contents?

Any file would do; suggestions are welcome :) I didn't have any good ideas and didn't want to get stuck on this point.

jonschz · 2026-03-08T10:52:44Z

I built a proof-of-concept and ran a few benchmarks:

- Ghidra startup (always required): ~3.5 seconds
- Loading an existing project: ~2.5 seconds
- Creating a new project, importing a binary (without analysis), and saving: ~7 seconds
- Analyzing ISLE.EXE: ~13 seconds

So a non-cached startup takes ~24 seconds (startup + create/import/save + analysis), while a cached startup takes ~6 seconds (startup + loading the existing project). I'll go for the cache approach. I haven't yet looked at the Git cache download issue again, though.
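For reference, the two totals follow directly from the measured steps. The variable names below are only for illustration; the numbers are the ones reported in the benchmark.

```python
# Timings in seconds, copied from the benchmark in this comment.
GHIDRA_STARTUP = 3.5
LOAD_EXISTING_PROJECT = 2.5
CREATE_IMPORT_SAVE = 7.0
ANALYZE_ISLE = 13.0

# Without a cache: start Ghidra, create and import the project, then analyze.
non_cached = GHIDRA_STARTUP + CREATE_IMPORT_SAVE + ANALYZE_ISLE
# With a cache: start Ghidra and load the already analyzed project.
cached = GHIDRA_STARTUP + LOAD_EXISTING_PROJECT

print(f"non-cached: {non_cached} s")               # non-cached: 23.5 s
print(f"cached: {cached} s")                       # cached: 6.0 s
print(f"saved per run: {non_cached - cached} s")   # saved per run: 17.5 s
```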

jonschz · 2026-03-08T14:14:11Z

I sort of got the cache to work. According to actions/cache#1361 (comment), the cache operation does not share caches across runners with different containers, so I had to set up a second cache for the Ghidra runs. Still needs some polish before it can be reviewed.

jonschz · 2026-03-13T19:01:56Z

reccmp/ghidra/importer/type_conversion.py

+_scalar_type_map: dict[CvdumpTypeKey, BuiltIn] = {
+    CVInfoTypeEnum.T_VOID: VoidDataType(),
+    CVInfoTypeEnum.T_HRESULT: LongDataType(),
+    CVInfoTypeEnum.T_CHAR: CharDataType(),
+    CVInfoTypeEnum.T_SHORT: ShortDataType(),
+    CVInfoTypeEnum.T_LONG: LongDataType(),
+    CVInfoTypeEnum.T_QUAD: LongLongDataType(),
+    CVInfoTypeEnum.T_UCHAR: UnsignedCharDataType(),
+    CVInfoTypeEnum.T_USHORT: UnsignedShortDataType(),
+    CVInfoTypeEnum.T_ULONG: UnsignedLongDataType(),
+    CVInfoTypeEnum.T_UQUAD: UnsignedLongLongDataType(),
+    CVInfoTypeEnum.T_REAL32: FloatDataType(),
+    CVInfoTypeEnum.T_REAL64: DoubleDataType(),
+    CVInfoTypeEnum.T_RCHAR: CharDataType(),
+    CVInfoTypeEnum.T_WCHAR: WideCharDataType(),
+    CVInfoTypeEnum.T_INT4: IntegerDataType(),
+    CVInfoTypeEnum.T_UINT4: UnsignedIntegerDataType(),
+}


I refactored these again; they now also work regardless of whether an analysis has been performed.
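The table-driven approach in the diff above can be illustrated without Ghidra. In this sketch `ScalarKey` is a hypothetical stand-in for `CVInfoTypeEnum`, plain strings stand in for the Ghidra data type instances, and `lookup_scalar` is an invented helper; how the real module handles unmapped keys is not shown in the PR excerpt.

```python
from enum import Enum
from typing import Dict, Optional


class ScalarKey(Enum):
    """Hypothetical stand-in for CVInfoTypeEnum."""

    T_CHAR = "T_CHAR"
    T_RCHAR = "T_RCHAR"
    T_LONG = "T_LONG"
    T_HRESULT = "T_HRESULT"
    T_UNMAPPED = "T_UNMAPPED"  # invented key, deliberately absent from the map


# Strings stand in for instances such as CharDataType() or LongDataType().
SCALAR_TYPE_MAP: Dict[ScalarKey, str] = {
    ScalarKey.T_CHAR: "char",
    ScalarKey.T_RCHAR: "char",    # several keys may share one target type
    ScalarKey.T_LONG: "long",
    ScalarKey.T_HRESULT: "long",
}


def lookup_scalar(key: ScalarKey) -> Optional[str]:
    """Return the mapped type, or None if the key is not a known scalar."""
    return SCALAR_TYPE_MAP.get(key)


print(lookup_scalar(ScalarKey.T_HRESULT))   # long
print(lookup_scalar(ScalarKey.T_UNMAPPED))  # None
```

Adding a new scalar then only means adding one dictionary entry, which matches the reviewer's expectation that extending the type list should go smoothly.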

jonschz · 2026-03-13T19:02:31Z

reccmp/ghidra/ghidra_import.py


-        # Not exactly sure why this is necessary, but it can't hurt
-        GhidraScriptUtil.acquireBundleHostReference()
-


Only required to run an analysis

jonschz · 2026-03-13T19:04:01Z

requirements-tests.txt

 mypy==1.16.1
 types-colorama>=0.4.6
-ghidra-stubs==11.4
+ghidra-stubs==12.0.2


Running the tests locally requires Ghidra 12. The import still works with Ghidra 11.4.

disinvite · 2026-03-14T14:59:09Z

@jonschz Thanks for your hard work to get this running without the embedded project. It looks great!

The new type_conversion module is a big improvement. If/when we add more types I expect the process will go smoothly.

Not a blocker because we are running fine in the container, but I can't run the tests locally. Did you ever see errors like this from pytest?

Windows fatal exception: access violation

Current thread 0x0000126c (most recent call first):
  File "reccmp\.venv\Lib\site-packages\jpype\_core.py", line 357 in startJVM
  File "reccmp\.venv\Lib\site-packages\pyghidra\launcher.py", line 426 in _setup_java
  File "reccmp\.venv\Lib\site-packages\pyghidra\launcher.py", line 510 in start
  File "reccmp\tests\ghidra_integration_test_setup.py", line 46 in ghidra_integration_test_program

If I call just HeadlessPyGhidraLauncher().start() from a python shell then it works. I have the JAVA_HOME and GHIDRA_INSTALL_DIR variables set correctly. I've never had a problem running the import scripts.

disinvite · 2026-03-14T15:02:33Z

tests/conftest.py

+    # Revert all side effects of the test that just ran
+    transaction.abort()


Out of curiosity, would the transaction still revert if the calling function (using the context of the ghidra program) threw an exception?

jonschz · 2026-03-14

I am not entirely sure, but it would be prudent to use try-finally. Good point.
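A sketch of the try-finally idea, using a `contextlib.contextmanager` generator because there an exception raised in the `with` body is re-raised at the `yield`, so cleanup placed after the `yield` is skipped unless it sits in a `finally` block. `FakeTransaction` and `isolated` are hypothetical stand-ins; the real fixture in tests/conftest.py and the Ghidra transaction object may look different.

```python
from contextlib import contextmanager


class FakeTransaction:
    """Hypothetical stand-in that only records whether abort() was called."""

    def __init__(self) -> None:
        self.aborted = False

    def abort(self) -> None:
        self.aborted = True


@contextmanager
def isolated(transaction: FakeTransaction):
    try:
        yield transaction
    finally:
        # Runs on normal exit and also when the with-body raises,
        # so side effects are reverted in both cases.
        transaction.abort()


tx = FakeTransaction()
try:
    with isolated(tx):
        raise RuntimeError("simulated failure while using the program")
except RuntimeError:
    pass

print(tx.aborted)  # True
```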

jonschz · 2026-03-14T20:28:29Z

> Windows fatal exception: access violation

I am able to run the tests locally and I do see a bunch of those, but those are red herrings - they don't hinder the test execution, at least on my machine. In general, I found debugging issues to be a bit difficult since pytest tends to swallow logs from errors in fixtures.

Do you have the latest version (Ghidra 12.0.4) installed and GHIDRA_INSTALL_DIR pointed to it? Some of the test setup requires Ghidra 12 because the relevant Ghidra 11 API was deprecated and I didn't want to support both.
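A quick way to check which Ghidra version `GHIDRA_INSTALL_DIR` points to might look like the sketch below. It assumes that the installation contains a `Ghidra/application.properties` file with an `application.version` entry; treat that layout as an assumption and verify it against your install. The helper names are invented for this example.

```python
import os
from pathlib import Path
from typing import Optional


def read_ghidra_version(install_dir: Path) -> Optional[str]:
    """Return the version string from the properties file, or None.

    Assumes <install_dir>/Ghidra/application.properties contains a line
    of the form `application.version=...`.
    """
    props = install_dir / "Ghidra" / "application.properties"
    if not props.is_file():
        return None
    for line in props.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "application.version":
            return value.strip()
    return None


def major_version(version: str) -> int:
    """Extract the leading integer, e.g. 12 from '12.0.4'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    install_dir = os.environ.get("GHIDRA_INSTALL_DIR")
    if install_dir is None:
        print("GHIDRA_INSTALL_DIR is not set")
    else:
        version = read_ghidra_version(Path(install_dir))
        if version is None:
            print("Could not determine the Ghidra version")
        elif major_version(version) < 12:
            print(f"Ghidra {version} found; the test setup requires Ghidra 12")
        else:
            print(f"Ghidra {version} found")
```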

disinvite · 2026-03-14T21:37:05Z

> Do you have the latest version (Ghidra 12.0.4) installed and GHIDRA_INSTALL_DIR pointed to it? Some of the test setup requires Ghidra 12 because the relevant Ghidra 11 API was deprecated and I didn't want to support both.

That was it, thanks. I'm always a few versions behind with Ghidra.

jonschz added 20 commits March 1, 2026 10:44

Add barebones Ghidra project with recompiled ISLE

2808319

Add simple Ghidra test

6ccae67

WIP: experimental container

021306e

Container experiment 02

8a34c4c

Attempt 03

35e8d2a

Install pip

8901f08

add venv

9e783bf

Try other image and apt

b182c9a

fix apt-get

a57d477

add python3-venv

1e39396

Restore stock unit test, separate Ghidra test

ad8b994

Try cache restore again

9640e51

try restore-keys

d890fbe

Disable binfiles, Python experiment

8f89310

Cleanup, let's see if it works

99c2b83

Refactor to fixture

9a9dc81

Restore other unit tests

9bbea1e

Write proper test for scalars, clean up scalar import

5816eb8

Cleanup

9927601

Cleanup

41c4630

jonschz force-pushed the ghidra-integration-tests branch from 937f131 to 41c4630 Compare March 1, 2026 09:45

Add Ghidra test isolation

04d1cb5

jonschz marked this pull request as draft March 1, 2026 18:50

jonschz added 6 commits March 8, 2026 13:58

Mostly working draft without embedded Ghidar project

dd2d272

Cleanup, fix scalar import

b8d4947

Git cache restore experiment 01

476f59a

git cache restore experiment 02

7ded8aa

git cache restore experiment 03

f209760

git cache restore experiment 04

05766bf

Structural refactor, make linters happy

76e280d

jonschz and others added 4 commits March 13, 2026 20:30

Try to fix CI failure, more cleanup

6e66cfb

Next fix attempt, clean up pipeline

e757aae

Fix oversights

8bae707

Merge branch 'master' into ghidra-integration-tests

c32fa19

jonschz marked this pull request as ready for review March 13, 2026 19:48

jonschz requested a review from disinvite March 13, 2026 19:48

Refactor, cleanup

87a59ac

disinvite approved these changes Mar 14, 2026

Fix fallback procedure behaviour

78a6b0a

Address review comments

aaa1586

jonschz merged commit 794b91f into isledecomp:master Mar 14, 2026
18 checks passed

jonschz deleted the ghidra-integration-tests branch March 14, 2026 21:06

jonschz linked an issue Mar 14, 2026 that may be closed by this pull request

Consider adding integration tests for Ghidra import #219

Closed

