Optimize ElfSurgery.add_section() to reduce file size overhead #3

@stellaraccident

Summary

ElfSurgery.add_section() adds significantly more file size overhead than objcopy --add-section when adding a section to an ELF binary.

Details

When adding a small section (e.g., .rocm_kpack_ref marker with ~56 bytes of content):

  • objcopy: Adds ~120 bytes total overhead
  • ElfSurgery.add_section(): Adds ~2,300 bytes overhead

This causes small binaries to grow after kpack transformation when the overhead exceeds the savings from zero-paging .hip_fatbin.
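The break-even condition can be written as a one-line calculation. This is a minimal sketch: `net_size_change` is a hypothetical helper, not part of ElfSurgery, and the 1,024-byte savings figure is an assumed value for illustration only. The two overhead figures are the approximate ones quoted above.

```python
def net_size_change(add_section_overhead: int, zero_page_savings: int) -> int:
    """Return the change in file size in bytes; a positive result means the binary grew."""
    return add_section_overhead - zero_page_savings


# Assumed savings from zero-paging a very small .hip_fatbin (hypothetical value).
ASSUMED_SAVINGS = 1024

# Approximate overheads quoted in this issue.
grew_by = net_size_change(2300, ASSUMED_SAVINGS)    # ElfSurgery.add_section(): positive, file grows
shrank_by = net_size_change(120, ASSUMED_SAVINGS)   # objcopy: negative, file shrinks
```

With these assumed inputs the same transformation grows the file under the current implementation and shrinks it under an objcopy-sized overhead.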

Root Cause

The current implementation:

  1. Always appends section content at file end
  2. Moves .shstrtab to end of file to append the new section name
  3. Rewrites all section headers at end of file

This wastes space because:

  • Original .shstrtab location becomes dead space
  • Original section header table location becomes dead space
  • May require additional padding for mmap alignment in map_section_to_load()
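The effect of the append-everything strategy can be estimated with a rough model. The sketch below assumes an ELF64 file (section headers are 64 bytes each) and ignores any mmap padding. The function names, the alignment defaults, and all input values (file size, .shstrtab size, section count) are hypothetical and chosen only for illustration, not measured from a real binary.

```python
ELF64_SHDR_SIZE = 64  # sizeof(Elf64_Shdr)


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def append_only_growth(file_size: int, content_size: int, section_name: str,
                       old_shstrtab_size: int, old_section_count: int,
                       content_align: int = 8, shdr_align: int = 8) -> int:
    """Estimate file growth when content, .shstrtab, and all section headers
    are re-emitted at the end of the file (the old copies become dead space)."""
    pos = align_up(file_size, content_align)
    pos += content_size                                    # new section content
    pos += old_shstrtab_size + len(section_name.encode()) + 1  # relocated .shstrtab + new name + NUL
    pos = align_up(pos, shdr_align)
    pos += (old_section_count + 1) * ELF64_SHDR_SIZE       # full rewritten header table
    return pos - file_size


def minimal_growth(content_size: int, section_name: str) -> int:
    """Lower bound: only the new content, the new name, and one new header."""
    return content_size + len(section_name.encode()) + 1 + ELF64_SHDR_SIZE


# Hypothetical binary: 10,000 bytes, 30 sections, 300-byte .shstrtab.
appended = append_only_growth(10_000, 56, ".rocm_kpack_ref", 300, 30)
lower_bound = minimal_growth(56, ".rocm_kpack_ref")
```

In this model most of the growth comes from rewriting the whole section header table, so the overhead scales with the number of existing sections rather than with the size of the section being added.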

Suggested Optimizations

  1. Extend .shstrtab in place if there's room (common in padding regions)
  2. Extend section header table in place if possible
  3. Reuse padding regions for new section content instead of always appending
  4. Pre-calculate mmap-aligned offset in add_section() to avoid duplication in map_section_to_load()
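Two of the building blocks behind these suggestions could look roughly like the sketch below. It is not the ElfSurgery API: `slack_after`, `can_extend_in_place`, and `mmap_congruent_offset` are hypothetical helpers, the extents are assumed to be `(file_offset, size)` pairs covering every occupied range in the file (headers, program headers, sections with file content), and the 4096-byte page size is an assumed default.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (file_offset, size_in_file)


def slack_after(extents: List[Extent], index: int, file_size: int) -> int:
    """Return the number of unused bytes directly after extents[index]."""
    start, size = extents[index]
    end = start + size
    limit = file_size
    for i, (other_start, other_size) in enumerate(extents):
        if i == index or other_size == 0:
            continue
        if other_start < end < other_start + other_size:
            return 0  # another range overlaps our end: no room to grow
        if other_start >= end:
            limit = min(limit, other_start)
    return max(limit - end, 0)


def can_extend_in_place(extents: List[Extent], index: int,
                        file_size: int, extra_bytes: int) -> bool:
    """True if extents[index] can grow by extra_bytes without moving anything."""
    return slack_after(extents, index, file_size) >= extra_bytes


def mmap_congruent_offset(min_offset: int, vaddr: int, page_size: int = 4096) -> int:
    """Smallest offset >= min_offset with offset % page_size == vaddr % page_size,
    which is the congruence a loadable segment needs between file offset and address."""
    return min_offset + (vaddr - min_offset) % page_size


# Hypothetical layout: four occupied ranges in a 384-byte file.
layout = [(0, 64), (64, 100), (200, 50), (256, 128)]
room = slack_after(layout, 1, 384)  # gap between offset 164 and offset 200
```

A check like `can_extend_in_place` would let add_section() fall back to appending only when the gap is too small, and computing the congruent offset once up front would avoid map_section_to_load() adding its own padding later.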

Impact

  • Small test binaries may grow slightly after kpack transformation
  • Production binaries with large .hip_fatbin sections (MBs) are effectively unaffected, since the savings far exceed the overhead
  • Functional correctness is not impacted

Workaround

None needed for production use. The overhead is only noticeable for edge cases with very small .hip_fatbin sections where zero-page savings are minimal.
