Add alternative emulation through unicorn engine#57

Open
ks0777 wants to merge 28 commits into Fraunhofer-AISEC:master from ks0777:unicorn

ks0777 (Contributor) commented Mar 14, 2023

This PR adds the Unicorn emulation engine to archie. It can be enabled by passing the --unicorn flag to the controller. With this option enabled, archie emulates the experiments with the Unicorn engine instead of QEMU. The (pre-)goldenrun is still emulated with QEMU, which is necessary to obtain a state from which the experiments can start.

The alternative emulation mode is implemented in a separate Rust library and integrated into archie with minimal changes to the faultclass and controller scripts. This allows easy reuse of the filtering and processing functions previously implemented for the experiment data returned by QEMU.

ks0777 force-pushed the unicorn branch 6 times, most recently from 64ab813 to 8d197d3 (March 23, 2023 11:39)
ks0777 (Contributor Author) commented Apr 21, 2023

Waiting for unicorn-engine/unicorn#1812 to be merged. We need the additional Rust bindings in order to clear the TB (translation block) cache after modifying instructions during a fault.

ks0777 force-pushed the unicorn branch 3 times, most recently from 9e429da to 12a47c2 (April 25, 2023 14:39)
e-shreve-ti (Contributor) commented

This is a nice addition! If I may, here is a suggestion to make the solution more flexible so that it can support additional emulation engines (even beyond QEMU and Unicorn):

  • Instead of a --unicorn argument to controller.py, do the following:
    • Either rename the --qemu option to --emu now, or add an --emu option as an alias for --qemu (they would do the same thing). Adding --emu without replacing --qemu would allow --qemu to be deprecated, leaving only --emu in a future update.
    • The JSON file passed to --emu could then provide either a "qemu" or a "unicorn" member. The value of that member would be the path to the emulator binary (just as the "qemu" member is today). If the "qemu" member is provided, QEMU is used; if the "unicorn" member is provided, Unicorn is used. Instead of passing the new bool value unicorn_emulation to controller(), the controller() function can then just look for the "unicorn" member and set its internal boolean unicorn_emulation based on that. This also spares users the confusion of passing the path to Unicorn in the "qemu" member.
    • Rename the "additional_qemu_args" member of config.json to "additional_args", which will be clearer to users long-term.

The idea here is that if another emulation engine is added in the future, the framework is already set up for it. The controller() function would just be modified to look for a new "otheremulatorname" member in the config file and take actions as needed there. It also means users don't have to match command-line parameters to controller.py with the contents of the emulation JSON file: all the settings are in the JSON file.
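The member-based selection suggested above could look roughly like the following sketch. It is not code from this PR: the function name select_emulator and the KNOWN_EMULATORS tuple are hypothetical, and the fallback from "additional_args" to "additional_qemu_args" is only one possible way to handle the proposed rename.

```python
import json

# Hypothetical list of supported config members; the discussion only
# mentions "qemu" and "unicorn".
KNOWN_EMULATORS = ("qemu", "unicorn")


def select_emulator(config):
    """Return (engine name, configured path, extra args) from a config dict."""
    present = [name for name in KNOWN_EMULATORS if name in config]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {KNOWN_EMULATORS} in config, found {present}"
        )
    engine = present[0]
    # Prefer the proposed "additional_args" name, fall back to the old one.
    extra_args = config.get("additional_args", config.get("additional_qemu_args", ""))
    return engine, config[engine], extra_args


# Example: a config that selects Unicorn (the path value is a placeholder).
example = json.loads('{"unicorn": "/path/to/emulator", "additional_args": ""}')
engine, path, extra_args = select_emulator(example)
unicorn_emulation = engine == "unicorn"
```

With this shape, supporting another engine would only mean adding its member name to the tuple and handling it in the caller.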

ks0777 (Contributor Author) commented May 3, 2023

Thanks for your suggestion! Including the choice of emulation engine in the configuration files sounds like a good idea. Note that when using Unicorn, the pre-goldenrun is still performed using QEMU to ensure compatibility with more firmware, since Unicorn cannot handle any hardware-related functions. Unlike QEMU, the Unicorn engine is invoked through a Python module that wraps a Rust library. The data required to initialize Unicorn is retrieved from the pre-goldenrun. Hence, there is no need to supply additional arguments or a path to the emulator binary for Unicorn. This may change in the future if we ever decide to add additional emulation engines, but for now these arguments only apply to QEMU.

ks0777 force-pushed the unicorn branch 2 times, most recently from 539dd18 to 9808021 (October 23, 2025 15:16)
make
cd emulation_worker
cargo build --release
cp target/release/libemulation_worker.so ../emulation_worker.so
Member commented:

What's the advantage of this, can we just leave it in the build dir?

Contributor Author commented:

I can't directly import the .so if its directory is not on Python's module search path. An alternative would be to add the build directory to the path before importing the .so in faultclass.py. This is easily done with 4 lines of code, but overall it's not really clean either imo. With that solution, however, there would be no more copies of the .so file, which might be less confusing.

Member commented:

I agree, neither solution is perfect. Maybe the best option would be to adopt the same approach we currently use for the faultplugin. That would mean not copying the library and instead adding a new config entry to qemuconf.json that specifies its location.
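The two ideas in this thread (extending the module search path, and reading the library location from the configuration) can be combined in a short sketch. This is an illustration, not the PR's code: the helper name add_library_dir_to_path and the config key "emulation_worker_dir" are hypothetical, and no such entry is confirmed to exist in qemuconf.json.

```python
import sys
from pathlib import Path


def add_library_dir_to_path(config, key="emulation_worker_dir"):
    """Prepend the configured library directory to sys.path, if one is set.

    `key` is a hypothetical config entry name chosen for this example.
    Returns the resolved directory, or None when the key is absent.
    """
    lib_dir = config.get(key)
    if lib_dir is None:
        return None
    resolved = str(Path(lib_dir).resolve())
    # Avoid duplicate entries when called more than once.
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved
```

A caller would invoke this once before the import statement in faultclass.py. One caveat: CPython locates extension modules by file name, so a file named libemulation_worker.so would likely still need to be named emulation_worker.so (or be loaded explicitly through importlib) for a plain import to find it.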

goldenrun.py Outdated
return [config_qemu["max_instruction_count"], experiment["data"], faultconfig]
return [
config_qemu["max_instruction_count"],
experiments[0]["data"],
Member commented:

This breaks if we do not have a pre-goldenrun (that is, if no start address is specified in the config). Not sure what the best way to handle this is.

Contributor Author commented:

The (pre-)goldenrun experiment results are now stored in distinct variables instead of a list of varying size. If no pre-goldenrun was performed, None is returned.
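The pattern described here, a fixed-shape return value with None standing in for a missing pre-goldenrun, might look like the sketch below. The function name collect_goldenrun_results and its parameters are hypothetical; only the "max_instruction_count" and "data" keys come from the diff context above.

```python
def collect_goldenrun_results(config_qemu, goldenrun, faultconfig, pregoldenrun=None):
    """Return results in a fixed order, using None for a missing pre-goldenrun.

    Hypothetical helper for illustration; the actual goldenrun.py may differ.
    """
    pregoldenrun_data = pregoldenrun["data"] if pregoldenrun is not None else None
    return [
        config_qemu["max_instruction_count"],
        pregoldenrun_data,
        goldenrun["data"],
        faultconfig,
    ]
```

Callers can then unpack four values unconditionally and test the second one against None, rather than inspecting the length of a list.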

Member commented:

Code looks good. Is it a problem for the Unicorn system if we do not have a memory dump from the pre-goldenrun in configs without a start address?

Contributor Author commented:

That's a problem, yes. We would either have to implement a pure Unicorn mode that does not require bootstrapping through QEMU, or print an error and abort when Unicorn emulation is requested without a start address. Is there a reason why you would not want to specify a start address? In the hybrid QEMU+Unicorn mode, specifying a start address will always improve performance, since the state is restored from the start address instead of fully emulating each run from the beginning.

Member commented:

I guess the only use case would be faulting the very first executed instruction. I think at least for now it makes sense to only print an error if no start address is specified in Unicorn mode.
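The agreed behaviour (report an error when Unicorn mode is requested without a start address) could be sketched as a small validation step. The function name validate_unicorn_config and the "start" key used to detect a start address are assumptions made for this example, not confirmed names from the archie configuration format.

```python
def validate_unicorn_config(unicorn_emulation, fault_config):
    """Raise ValueError if Unicorn mode is requested without a start address.

    Assumes (hypothetically) that the start address lives under a "start"
    key in the fault configuration.
    """
    if unicorn_emulation and "start" not in fault_config:
        raise ValueError(
            "Unicorn emulation requires a start address: the initial state "
            "is taken from the QEMU pre-goldenrun."
        )
```

The caller can catch the exception, print the message, and exit before any experiment is scheduled. QEMU-only runs are unaffected by the check.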

g_string_append_printf(out, "Current Version of QEMU Plugin is %i, Min Version is %i\n", info->version.cur, info->version.min);
architecture = malloc(strlen(info->target_name)+1);
if (!architecture) return -1;
strcpy(architecture, info->target_name);

Check failure: Code scanning / Flawfinder (Error)

buffer/strcpy: Does not check for buffer overflows when copying to destination [MS-banned] (CWE-120).
g_autoptr(GString) out = g_string_new("");
g_string_printf(out, "QEMU Injection Plugin\n Current Target is %s\n", info->target_name);
g_string_append_printf(out, "Current Version of QEMU Plugin is %i, Min Version is %i\n", info->version.cur, info->version.min);
architecture = malloc(strlen(info->target_name)+1);

Check notice: Code scanning / Flawfinder (Note)

buffer/strlen: Does not handle strings that are not \0-terminated; if given one it may perform an over-read (it could cause a crash if unprotected) (CWE-126).
3 participants