Open
Description
Summary
The ElfSurgery.add_section() implementation has significantly more overhead than objcopy --add-section when adding sections to ELF binaries.
Details
When adding a small section (e.g., .rocm_kpack_ref marker with ~56 bytes of content):
- objcopy: Adds ~120 bytes total overhead
- ElfSurgery.add_section(): Adds ~2,300 bytes overhead
This causes small binaries to grow after kpack transformation when the overhead exceeds the savings from zero-paging .hip_fatbin.
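As a rough sanity check on the numbers above, the sketch below computes the smallest metadata cost of one extra section. It assumes a 64-bit ELF (64-byte section header entries) and ignores alignment padding; the function name is hypothetical and not part of ElfSurgery.

```python
ELF64_SHDR_SIZE = 64  # sizeof(Elf64_Shdr); a 32-bit ELF uses 40-byte entries


def min_add_section_overhead(name: str) -> int:
    """Metadata bytes a new section needs beyond its own content.

    Counts one section header entry plus the NUL-terminated name
    appended to .shstrtab. Alignment padding is not included.
    """
    name_bytes = len(name.encode("ascii")) + 1  # +1 for the trailing NUL
    return ELF64_SHDR_SIZE + name_bytes


overhead = min_add_section_overhead(".rocm_kpack_ref")
print(overhead)
```

Under these assumptions the floor for `.rocm_kpack_ref` is 80 bytes, which is in the same range as the ~120 bytes reported for objcopy and far below the ~2,300 bytes reported for `ElfSurgery.add_section()`.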
Root Cause
The current implementation:
- Always appends section content at file end
- Moves `.shstrtab` to end of file to append the new section name
- Rewrites all section headers at end of file
This wastes space because:
- Original `.shstrtab` location becomes dead space
- Original section header table location becomes dead space
- May require additional padding for mmap alignment in `map_section_to_load()`
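The dead space described above can be estimated with a small calculation. This is a minimal sketch assuming a 64-bit ELF and that both the old `.shstrtab` and the old section header table are abandoned in full; the function name and the example inputs are hypothetical, not measurements from a real binary.

```python
ELF64_SHDR_SIZE = 64  # sizeof(Elf64_Shdr)


def relocation_dead_space(shstrtab_size: int, section_count: int) -> int:
    """Bytes orphaned when .shstrtab and the section header table
    are both rewritten at the end of the file.

    shstrtab_size: size in bytes of the original .shstrtab
    section_count: number of entries in the original section header table
    """
    return shstrtab_size + section_count * ELF64_SHDR_SIZE


# Illustrative inputs only: a binary with 30 sections and a 300-byte .shstrtab.
print(relocation_dead_space(shstrtab_size=300, section_count=30))
```

For those illustrative inputs the estimate is 2,220 bytes, so the abandoned tables alone can account for overhead on the order reported in this issue, before any mmap alignment padding is added.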
Suggested Optimizations
- Extend `.shstrtab` in place if there's room (common in padding regions)
- Extend section header table in place if possible
- Reuse padding regions for new section content instead of always appending
- Pre-calculate mmap-aligned offset in `add_section()` to avoid duplication in `map_section_to_load()`
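One possible shape for the in-place check and the offset pre-calculation is sketched below. It is not the ElfSurgery implementation: the helper names are hypothetical, and it assumes the caller supplies every file extent that holds data (ELF header, program header table, section header table, and all sections except `SHT_NOBITS`) as `(offset, size)` pairs.

```python
def free_bytes_after(region, occupied, file_size):
    """Unused bytes between the end of `region` and the next occupied extent.

    region:   (offset, size) of the table to extend, e.g. .shstrtab
    occupied: iterable of (offset, size) for everything stored in the file
    """
    end = region[0] + region[1]
    limit = file_size
    for off, length in occupied:
        if (off, length) == region or length == 0:
            continue
        if off < end < off + length:
            return 0  # another extent already covers the end of the region
        if off >= end:
            limit = min(limit, off)
    return limit - end


def can_extend_in_place(region, occupied, file_size, extra):
    """True if `extra` bytes fit in the gap directly after `region`."""
    return free_bytes_after(region, occupied, file_size) >= extra


def align_up(offset, alignment=4096):
    """Round `offset` up to the next multiple of `alignment`."""
    return (offset + alignment - 1) // alignment * alignment


# Hypothetical layout: .shstrtab at 1000..1200, next extent starts at 1232.
layout = [(0, 64), (64, 900), (1000, 200), (1232, 640)]
print(can_extend_in_place((1000, 200), layout, 1872, extra=16))
```

If the check fails, falling back to the current append-at-end path keeps behavior unchanged. Calling something like `align_up` once in `add_section()` would let `map_section_to_load()` reuse the offset instead of padding again; the 4096-byte default is an assumption about page size.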
Impact
- Small test binaries may grow slightly after kpack transformation
- Production binaries with large `.hip_fatbin` sections (MBs) are unaffected - savings far exceed overhead
- Functional correctness is not impacted
Workaround
None needed for production use. The overhead is only noticeable for edge cases with very small .hip_fatbin sections where zero-page savings are minimal.