@dummy24 commented Nov 26, 2025

This commit adds support for dyld chained fixups for arm64 macOS binaries. This is essential for correctly loading and executing binaries on newer versions of macOS.

The main changes include:

  • Mach-O Parser:
    • Added parsing for the `LC_DYLD_CHAINED_FIXUPS` load command.
    • Implemented new data structures to represent chained fixups information, including `DyldChainedHeader`, `DyldChainedStartsInSegment`, and `DyldChainedImport`.
  • Mach-O Loader:
    • Added the `dyld_chained_fixups` method in `QlLoaderMACHO` to process and apply chained fixups during loading.
    • This method resolves pointer chains, rebases addresses, and resolves imported symbols.
  • macOS Emulation:
    • Added the `hook_imports` method in `QlOsMacos` to intercept and handle calls to imported symbols resolved via chained fixups.
    • This allows providing custom implementations for these imported functions via `set_api`.
  • Constants:
    • Added constants related to chained fixups in `macho_parser/const.py`.

These changes significantly enhance Qiling's emulation capabilities on arm64 macOS targets.
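
For readers unfamiliar with the format, the chain walk summarized above can be sketched in plain Python, independent of Qiling. The sketch assumes the `DYLD_CHAINED_PTR_64` layout from Apple's `mach-o/fixup-chains.h` (36-bit target, 8-bit high8, 12-bit next link counted in 4-byte units, top bit as the bind flag). The names `apply_chain`, `slide`, and `imports` are hypothetical and are not taken from this PR.

```python
import struct

STRIDE = 4  # DYLD_CHAINED_PTR_64 links are counted in 4-byte units


def apply_chain(page: bytearray, start: int, slide: int, imports: list) -> None:
    """Walk one fixup chain inside `page`, patching every slot in place."""
    offset = start
    while True:
        (value,) = struct.unpack_from("<Q", page, offset)
        next_units = (value >> 51) & 0xFFF
        is_bind = (value >> 63) & 1

        if is_bind:
            # bind: low 24 bits select an import, next 8 bits are an addend
            ordinal = value & 0xFFFFFF
            addend = (value >> 24) & 0xFF
            fixed = imports[ordinal] + addend
        else:
            # rebase: rebuild the unslid address, then apply the slide
            target = value & 0xFFFFFFFFF
            high8 = (value >> 36) & 0xFF
            fixed = ((high8 << 56) | target) + slide

        struct.pack_into("<Q", page, offset, fixed & 0xFFFFFFFFFFFFFFFF)

        if next_units == 0:  # a zero link terminates the chain
            break
        offset += next_units * STRIDE


# Two-slot chain: a rebase at offset 0 that links to a bind at offset 8.
page = bytearray(16)
struct.pack_into("<Q", page, 0, 0x1000 | (2 << 51))  # rebase, next = 2 units
struct.pack_into("<Q", page, 8, 1 << 63)             # bind ordinal 0, end of chain
apply_chain(page, 0, slide=0x100000000, imports=[0x7FFF00001234])
```

After the call, the first slot holds the slid address and the second holds the address looked up for import ordinal 0. A real loader would also have to handle the other pointer formats and the per-page chain starts, which this sketch leaves out.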

Checklist

Which kind of PR do you create?

  • This PR only contains minor fixes.
  • This PR contains major feature update.
  • This PR introduces a new function/api for Qiling Framework.

Coding convention?

  • The new code conforms to Qiling Framework naming convention.
  • The imports are arranged properly.
  • Essential comments are added.
  • The reference of the new code is pointed out.

Extra tests?

  • No extra tests are needed for this PR.
  • I have added enough tests for this PR.
  • Tests will be added after some discussion and review.

Changelog?

  • This PR doesn't need to update Changelog.
  • Changelog will be updated after some proper review.
  • Changelog has been updated in my PR.

Target branch?

  • The target branch is dev branch.

Member

@elicn left a comment

Thanks for the contribution! macOS indeed needs a good refresh.
Please see my comments and suggestions.

while not done:
    target_offset = 0
    if pointer_format == DYLD_CHAINED_PTR_64:
        value = self.ql.unpack64(self.ql.mem.read(chain_cursor_ptr, 8))
Member

Instead of `ql.unpack(ql.mem.read(...))`, use the `ql.mem.read_ptr` method.
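
For illustration, the two styles can be compared on a small stand-in for the memory manager. `FakeMem` below is hypothetical and only mimics the shape of `ql.mem.read` and `ql.mem.read_ptr`; the exact signature and default size of the real method should be checked against Qiling's memory manager.

```python
import struct


class FakeMem:
    """Hypothetical stand-in for ql.mem, backed by a bytearray."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)

    def read(self, addr: int, size: int) -> bytes:
        return bytes(self.data[addr:addr + size])

    def read_ptr(self, addr: int, size: int = 8) -> int:
        # one call: read `size` bytes and decode them as a little-endian integer
        return int.from_bytes(self.read(addr, size), "little")


mem = FakeMem(16)
mem.data[0:8] = struct.pack("<Q", 0x0010000000001000)

# style used in the PR: read raw bytes, then unpack them in a second step
value_old = struct.unpack("<Q", mem.read(0, 8))[0]

# style suggested in the review: a single pointer-sized read
value_new = mem.read_ptr(0, 8)
```

Both reads yield the same integer; the second form simply keeps the width and byte order in one place.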

high8 = (value >> 36) & 0xFF
next_stride = (value >> 51) & 0xFFF
is_bind = (value >> 63) & 0x1 == 1
if is_bind is False: target_offset = target | (high8 << 36)
Member

Please avoid writing the True clause on the same line as the `if`.
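
A version with the body on its own line might look like the sketch below. The field extraction follows the `dyld_chained_ptr_64_rebase` bitfield layout in Apple's `mach-o/fixup-chains.h`. In that layout `high8` is the top byte of the pointer, so this sketch shifts it by 56 rather than the 36 used in the snippet above, which may be worth double-checking. `decode_rebase_64` is a hypothetical helper name, not part of the PR.

```python
def decode_rebase_64(value: int) -> dict:
    """Split a raw DYLD_CHAINED_PTR_64 slot into its bitfields."""
    target = value & 0xFFFFFFFFF          # bits 0..35
    high8 = (value >> 36) & 0xFF          # bits 36..43
    next_stride = (value >> 51) & 0xFFF   # bits 51..62
    is_bind = bool((value >> 63) & 0x1)   # bit 63

    target_offset = 0
    if not is_bind:
        # body on its own line; high8 goes back into the top byte
        target_offset = target | (high8 << 56)

    return {"target_offset": target_offset, "next": next_stride, "is_bind": is_bind}


fields = decode_rebase_64((0xAB << 36) | (3 << 51) | 0x4000)
```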

Comment on lines +468 to +470
    self.ql.mem.write(chain_cursor_ptr, self.ql.pack32(corrected_addr))
else:
    self.ql.mem.write(chain_cursor_ptr, self.ql.pack64(corrected_addr))
Member

Instead of `ql.mem.write(ql.pack(...))`, use the `ql.mem.write_ptr` method.
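
One side effect of this suggestion is that the 32-bit/64-bit branch can collapse into a single call that takes the width as an argument. The sketch below shows the idea on a hypothetical `FakeMem` stand-in; its `write_ptr` only approximates the shape of `ql.mem.write_ptr`, and `store_fixup` is an illustrative helper, so the real signature should be verified in Qiling itself.

```python
class FakeMem:
    """Hypothetical stand-in for ql.mem, backed by a bytearray."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)

    def write(self, addr: int, data: bytes) -> None:
        self.data[addr:addr + len(data)] = data

    def write_ptr(self, addr: int, value: int, size: int = 8) -> None:
        # encode `value` as a little-endian integer of `size` bytes and store it
        self.write(addr, value.to_bytes(size, "little"))


def store_fixup(mem: FakeMem, addr: int, value: int, is_32bit: bool) -> None:
    # the pointer width becomes an argument, so no if/else over pack32/pack64
    mem.write_ptr(addr, value, 4 if is_32bit else 8)


mem = FakeMem(16)
store_fixup(mem, 0, 0x11223344, is_32bit=True)
store_fixup(mem, 8, 0x0000000100004000, is_32bit=False)
```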

if self.ql.arch.type == QL_ARCH.X8664:
    load_commpage(self.ql)

if depth == 0 and self.is_driver is False and self.ql.arch.type == QL_ARCH.ARM64:
Member

Writing `is False` or `is True` is not conventional.
Consider just using `if self.is_driver` or `if not self.is_driver`.
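
Beyond style, there is a practical difference: an identity check against `False` matches only the `False` singleton, so other falsy values such as `None` or `0` slip through. A small sketch, with a hypothetical `Loader` class standing in for the real one:

```python
class Loader:
    """Hypothetical stand-in; only the is_driver attribute matters here."""

    def __init__(self, is_driver) -> None:
        self.is_driver = is_driver


def not_driver_identity(loader: Loader) -> bool:
    # matches only the exact False object
    return loader.is_driver is False


def not_driver_truthy(loader: Loader) -> bool:
    # matches every falsy value
    return not loader.is_driver


# Each tuple is (identity check, truthiness check) for one attribute value.
results = [
    (not_driver_identity(Loader(v)), not_driver_truthy(Loader(v)))
    for v in (False, None, 0, True)
]
```

The two checks agree for `False` and `True` but disagree for `None` and `0`, which is where the identity form tends to surprise.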

def __str__(self):
    pass

class DyldChainedHeader:
Member

Please use the classes available in `qiling.os.struct` to work with "C structures".
They are heavily documented, and there are plenty of examples throughout the code.

Author

Thank you for the suggestion to use the classes from qiling.os.struct.

I noticed that data.py currently does not utilize qiling.os.struct. My main concern is about consistency: Should I refactor the existing code to align with this approach, or should I maintain the current implementation style within data.py?

Furthermore, the current parsing logic operates using file offsets/pointers (or stream positions) rather than memory pointers. For example, calculating the offset is necessary to locate the next structure in the file. Using a memory-centric utility like qiling.os.struct might not be directly applicable or could introduce unnecessary complexity for file-based parsing.

Could you provide further guidance on how to best handle this disparity?

Member

That is a valid question.
Everything around macOS emulation kind of froze in time and did not get refactored alongside the other components. This is why you can still see such relics of old practices. You can browse through the Linux / ELF files to get a general idea of what fresh code should look like.

I am not sure I entirely follow the other comment about pointers and structures. The base classes available in `os.struct` are very powerful in terms of code readability and efficiency. You may use them in the exact same way as you would initialize a C structure and write it entirely to memory, or read data from memory and "cast" it into a C structure, making its members easily readable.
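
As a rough illustration of the "cast bytes into a C structure" idea applied to file-based parsing, here is a sketch using plain `ctypes`, on which the `qiling.os.struct` base classes appear to be built. The field list follows `dyld_chained_fixups_header` from Apple's `mach-o/fixup-chains.h`. The class name, the sample bytes, and the `file_offset` variable are illustrative assumptions, not code from this PR.

```python
import ctypes
import struct


class DyldChainedFixupsHeader(ctypes.LittleEndianStructure):
    """Seven 32-bit fields, 28 bytes in total."""

    _fields_ = [
        ("fixups_version", ctypes.c_uint32),
        ("starts_offset", ctypes.c_uint32),
        ("imports_offset", ctypes.c_uint32),
        ("symbols_offset", ctypes.c_uint32),
        ("imports_count", ctypes.c_uint32),
        ("imports_format", ctypes.c_uint32),
        ("symbols_format", ctypes.c_uint32),
    ]


# Pretend file contents: 16 bytes of padding, then the header itself.
file_data = bytes(16) + struct.pack("<7I", 0, 32, 80, 96, 4, 1, 0)
file_offset = 16

# "Cast" the bytes at a file offset into the structure; no emulated memory involved.
header = DyldChainedFixupsHeader.from_buffer_copy(file_data, file_offset)

# File-relative arithmetic still works: the next structure is located
# relative to where this header starts in the file.
starts_at = file_offset + header.starts_offset
```

The same structure definition could then serve both cases: populated from file bytes while parsing, or read from and written to emulated memory while loading.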

If there are more specific questions, hit me up on Telegram; that would be much quicker.
