Manually load ELF binaries #496

Merged
CvvT merged 19 commits into microsoft:main from jstarks:load
Nov 25, 2025

jstarks (Member) commented Nov 17, 2025

Parse and load ELF binaries manually instead of using the `elf_loader` crate. This is done for a few reasons:

  • `elf_loader` is not safe for our use. It assumes it can access mapped memory directly via Rust memory access primitives, which is not always true for LiteBox because that memory is user-mode memory. Even when we know there are no other guest threads, so that concurrent mutation should be impossible, we still need to handle that memory differently from Rust memory.

  • `elf_loader` creates a dependency on shim TLS to get the task object, since the various mmap-related traits do not take `&self`. This prevents some future refactoring to eliminate the use of TLS within the shims. Fixing this upstream seems like a big change, one that would break semver compatibility.

  • `elf_loader` seems to combine parsing and mapping into one step, which prevents us from returning errors from `execve` when the target binary is not a valid ELF executable.

  • `elf_loader` has a lot of code marked `unsafe` whose safety contracts are unclear (to me).

  • Loading the binary is an important part of the Linux and OPTEE shims, so we want to understand and control exactly which loader ABI we are implementing.

For now, just update the Linux shim to use this new loader. This work has uncovered some design problems with OPTEE that need to be addressed before we can remove the dependency on `elf_loader`.
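
To illustrate the point about separating parsing from mapping, here is a minimal sketch that validates an ELF64 header from a plain byte buffer before anything is mapped, so that a bad binary can still be reported as an `execve` failure. The names `ParseError`, `ElfHeader`, `parse_header`, and `execve_check` are hypothetical and are not taken from this PR; the sketch assumes little-endian ELF64 only.

```rust
/// Hypothetical error type for this sketch (not the PR's actual error enum).
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    TooShort,
    BadMagic,
    UnsupportedClass,
    UnsupportedEndian,
    NotExecutable,
}

/// The ELF64 header fields a loader needs before it maps anything.
#[derive(Debug, PartialEq, Eq)]
pub struct ElfHeader {
    pub e_type: u16,
    pub e_machine: u16,
    pub e_entry: u64,
    pub e_phoff: u64,
    pub e_phentsize: u16,
    pub e_phnum: u16,
}

const EHDR_SIZE: usize = 64; // size of an ELF64 file header
const ET_EXEC: u16 = 2;
const ET_DYN: u16 = 3;
/// Linux errno for "Exec format error".
pub const ENOEXEC: i32 = 8;

fn le_u16(buf: &[u8], off: usize) -> u16 {
    u16::from_le_bytes([buf[off], buf[off + 1]])
}

fn le_u64(buf: &[u8], off: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&buf[off..off + 8]);
    u64::from_le_bytes(b)
}

/// Parse and validate the header from a buffer owned by the loader.
/// No guest memory is touched here, so failure has no side effects.
pub fn parse_header(buf: &[u8]) -> Result<ElfHeader, ParseError> {
    if buf.len() < EHDR_SIZE {
        return Err(ParseError::TooShort);
    }
    if !buf.starts_with(b"\x7fELF") {
        return Err(ParseError::BadMagic);
    }
    if buf[4] != 2 {
        return Err(ParseError::UnsupportedClass); // EI_CLASS != ELFCLASS64
    }
    if buf[5] != 1 {
        return Err(ParseError::UnsupportedEndian); // EI_DATA != ELFDATA2LSB
    }
    let e_type = le_u16(buf, 16);
    if e_type != ET_EXEC && e_type != ET_DYN {
        return Err(ParseError::NotExecutable);
    }
    Ok(ElfHeader {
        e_type,
        e_machine: le_u16(buf, 18),
        e_entry: le_u64(buf, 24),
        e_phoff: le_u64(buf, 32),
        e_phentsize: le_u16(buf, 54),
        e_phnum: le_u16(buf, 56),
    })
}

/// Hypothetical `execve`-side check: any parse failure becomes ENOEXEC
/// before the old address space is torn down or anything is mapped.
pub fn execve_check(buf: &[u8]) -> Result<ElfHeader, i32> {
    parse_header(buf).map_err(|_| ENOEXEC)
}
```

Because validation finishes before the mapping phase begins, the caller can still return an error to the process that called `execve`.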

As part of this, implement relocation support for OPTEE, which previously relied on `elf_loader` for relocations; the Linux shim does not need this. I am wondering whether in the future we could instead inject a loader binary into OPTEE user mode, similar to the interpreter used by Linux binaries; then we could run the relocator and whatever else we want (e.g., TLS allocation) there.
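
A minimal sketch of what applying base-relative relocations can look like when writes go through an explicit guest-memory handle instead of dereferencing Rust pointers. Everything here is an assumption for illustration: `GuestMemory`, `VecMemory`, and `apply_relative_relocs` are hypothetical names, and x86-64 `R_X86_64_RELATIVE` entries in `Elf64_Rela` layout are used as the example. The PR text does not say which relocation types the loader actually handles.

```rust
/// Hypothetical fault marker returned when a guest address is not writable.
#[derive(Debug, PartialEq, Eq)]
pub struct Fault;

/// Hypothetical guest-memory interface. It takes `&mut self`, so the caller
/// passes its task/address-space state explicitly instead of reading TLS.
pub trait GuestMemory {
    fn write_u64(&mut self, addr: u64, value: u64) -> Result<(), Fault>;
}

#[derive(Debug, PartialEq, Eq)]
pub enum RelocError {
    Truncated,
    UnsupportedType(u32),
    Overflow,
    GuestFault,
}

/// x86-64 psABI relocation type: value = load base + addend.
pub const R_X86_64_RELATIVE: u32 = 8;
/// Size of one Elf64_Rela entry: r_offset, r_info, r_addend (8 bytes each).
const RELA_SIZE: usize = 24;

fn le_u64(bytes: &[u8]) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(b)
}

/// Apply every entry of a little-endian Elf64_Rela table to guest memory.
/// Returns the number of relocations applied.
pub fn apply_relative_relocs<M: GuestMemory>(
    mem: &mut M,
    rela: &[u8],
    base: u64,
) -> Result<usize, RelocError> {
    if rela.len() % RELA_SIZE != 0 {
        return Err(RelocError::Truncated);
    }
    let mut applied = 0;
    for entry in rela.chunks_exact(RELA_SIZE) {
        let r_offset = le_u64(&entry[0..8]);
        let r_info = le_u64(&entry[8..16]);
        let r_addend = le_u64(&entry[16..24]); // two's-complement i64 bits
        let r_type = (r_info & 0xffff_ffff) as u32;
        if r_type != R_X86_64_RELATIVE {
            return Err(RelocError::UnsupportedType(r_type));
        }
        let target = base.checked_add(r_offset).ok_or(RelocError::Overflow)?;
        // Wrapping add on the raw bits handles negative addends correctly.
        let value = base.wrapping_add(r_addend);
        mem.write_u64(target, value)
            .map_err(|_| RelocError::GuestFault)?;
        applied += 1;
    }
    Ok(applied)
}

/// Stand-in for guest memory, used only to exercise the sketch.
pub struct VecMemory {
    pub base: u64,
    pub bytes: Vec<u8>,
}

impl GuestMemory for VecMemory {
    fn write_u64(&mut self, addr: u64, value: u64) -> Result<(), Fault> {
        let start = addr.checked_sub(self.base).ok_or(Fault)? as usize;
        let end = start.checked_add(8).ok_or(Fault)?;
        let dst = self.bytes.get_mut(start..end).ok_or(Fault)?;
        dst.copy_from_slice(&value.to_le_bytes());
        Ok(())
    }
}
```

Routing every store through a fallible `write_u64` means an unmapped or read-only target surfaces as an error rather than as undefined behavior in the loader.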

jstarks marked this pull request as ready for review November 17, 2025 16:54

wdcui (Member) left a comment

Thank you for adding our own ELF loader, John! The code looks good to me. I left some minor comments. Please wait for reviews from @sangho2 and @CvvT before you merge this PR.

wdcui requested review from CvvT and sangho2 November 18, 2025 05:06

sangho2 (Contributor) commented Nov 18, 2025

It seems that this new ELF loader has a problem with the KMPP TA (other three TAs work well). I'll take a look at it more.

cargo run -p litebox_runner_optee_on_linux_userland litebox_runner_optee_on_linux_userland/tests/kmpp-ta.elf.hooked litebox_runner_optee_on_linux_userland/tests/kmpp-ta-cmds.json

jstarks (Member, Author) commented Nov 19, 2025

> It seems that this new ELF loader has a problem with the KMPP TA (other three TAs work well). I'll take a look at it more.
>
> cargo run -p litebox_runner_optee_on_linux_userland litebox_runner_optee_on_linux_userland/tests/kmpp-ta.elf.hooked litebox_runner_optee_on_linux_userland/tests/kmpp-ta-cmds.json

Is KMPP TA not part of the CI tests?

jstarks (Member, Author) commented Nov 19, 2025

Hmm, I see, KMPP-TA fails to run but the test succeeds anyway...

wdcui (Member) commented Nov 25, 2025

@sangho2 @CvvT, if you think this PR is ready to be merged, please approve it. Thanks!

CvvT (Contributor) left a comment

LGTM! Thanks! I left one comment. Feel free to merge once all conflicts are resolved.

jstarks enabled auto-merge November 25, 2025 20:59

github-actions commented

🤖 SemverChecks 🤖 ⚠️ Potential breaking API changes detected ⚠️

--- failure enum_missing: pub enum removed or renamed ---

Description:
A publicly-visible enum cannot be imported by its prior path. A `pub use` may have been removed, or the enum itself may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.45.0/src/lints/enum_missing.ron

Failed in:
  enum litebox_shim_linux::loader::ElfLoaderError, previously in file /home/runner/work/litebox/litebox/target/semver-checks/git-main/ba97c2833b325e6735f9be1f2938a269c93edf95/litebox_shim_linux/src/loader/elf.rs:470

--- failure pub_module_level_const_missing: pub module-level const is missing ---

Description:
A public const is missing or renamed
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.45.0/src/lints/pub_module_level_const_missing.ron

Failed in:
  REWRITER_VERSION_NUMBER in file /home/runner/work/litebox/litebox/target/semver-checks/git-main/ba97c2833b325e6735f9be1f2938a269c93edf95/litebox_shim_linux/src/loader/mod.rs:25
  REWRITER_MAGIC_NUMBER in file /home/runner/work/litebox/litebox/target/semver-checks/git-main/ba97c2833b325e6735f9be1f2938a269c93edf95/litebox_shim_linux/src/loader/mod.rs:24

--- failure struct_missing: pub struct removed or renamed ---

Description:
A publicly-visible struct cannot be imported by its prior path. A `pub use` may have been removed, or the struct itself may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.45.0/src/lints/struct_missing.ron

Failed in:
  struct litebox_shim_linux::loader::ElfLoadInfo, previously in file /home/runner/work/litebox/litebox/target/semver-checks/git-main/ba97c2833b325e6735f9be1f2938a269c93edf95/litebox_shim_linux/src/loader/elf.rs:210

jstarks added this pull request to the merge queue Nov 25, 2025
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 25, 2025

CvvT (Contributor) commented Nov 25, 2025

The test test_tun_tcp_socket_as_client seems to be flaky; I will take a look.

CvvT added this pull request to the merge queue Nov 25, 2025
Merged via the queue into microsoft:main with commit 8883e4f Nov 25, 2025
8 checks passed
CvvT deleted the load branch November 25, 2025 22:42