clang-offload-bundler incorrectly errors on multi-CCOB binaries #448

@stellaraccident

(The analysis below was produced by Claude after we discovered that clang-offload-bundler cannot properly unbundle some of our production libraries.)

===================================================================
BUG REPORT: clang-offload-bundler Fails on Concatenated CCOB Bundles

Date: 2025-10-30
Reporter: [Your name]
LLVM Project: amd/comgr and clang/tools/clang-offload-bundler

===================================================================
SUMMARY

clang-offload-bundler fails to unbundle files that contain multiple
concatenated CCOB (Clang Code Object Bundle) compressed bundles. It
reports: "Failed to decompress input: Could not decompress embedded file
contents: Src size is incorrect"

This is caused by an incomplete fix in commit efda523 (October 2025)
that fixed llvm/lib/Object/OffloadBundle.cpp but did NOT fix the
duplicate implementation in clang/lib/Driver/OffloadBundler.cpp.

===================================================================
AFFECTED FILES

Real-world ROCm libraries with concatenated CCOB bundles:

  • librocblas.so.5 (64 concatenated CCOBs, 8.2 MB .hip_fatbin section)
  • librocsparse.so (similar structure)
  • Likely other BLAS/sparse libraries

These libraries work fine at runtime (via COMGR) but cannot be
processed by the clang-offload-bundler command-line tool.

===================================================================
ROOT CAUSE

Two implementations of CCOB decompression exist, and only one of them has been fixed:

FIXED Implementation (llvm/lib/Object/OffloadBundle.cpp:546):
StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);

✅ Correctly limits decompression to first CCOB's TotalFileSize
✅ Handles concatenated CCOBs properly

BUGGY Implementation (clang/lib/Driver/OffloadBundler.cpp:1270):
StringRef CompressedData = Blob.substr(HeaderSize);

❌ Reads from HeaderSize to END of buffer
❌ When buffer contains multiple concatenated CCOBs, includes all of them
❌ Zstd decompressor tries to decompress beyond first bundle boundary
❌ Encounters second CCOB header mid-stream, causes corruption/error
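
The difference between the two slices can be reproduced outside LLVM. The following is a minimal sketch that uses std::string_view as a stand-in for StringRef and a made-up 4-byte header (magic only); the helper names and the toy layout are assumptions for illustration, not the real CCOB format.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical toy layout: each fake bundle is just the 4-byte "CCOB" magic
// followed by its payload. Real CCOB headers carry more fields than this.
constexpr std::size_t HeaderSize = 4;

// Mirrors Blob.substr(HeaderSize): everything up to the END of the buffer,
// including any bundles concatenated after the first one.
inline std::string_view unboundedPayload(std::string_view Blob) {
  return Blob.substr(HeaderSize);
}

// Mirrors Blob.substr(HeaderSize, TotalFileSize - HeaderSize): only the
// bytes that belong to the first bundle.
inline std::string_view boundedPayload(std::string_view Blob,
                                       std::size_t TotalFileSize) {
  assert(TotalFileSize >= HeaderSize && "size field smaller than header");
  return Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
}
```

For a buffer such as "CCOBaaaaCCOBbb" with a first-bundle size of 8, the unbounded slice also contains the second bundle's magic and payload, while the bounded slice stops at "aaaa".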

===================================================================
REPRODUCTION

Test File: librocblas.so.5 from ROCm 6.x distribution
.hip_fatbin section: 8,163,887 bytes containing 64 concatenated CCOBs

Structure:
Offset 0x0: CCOB header + 1.16 MB compressed (→ 12.41 MB uncompressed)
Offset 0x129000: CCOB header + 1.01 MB compressed (→ 13.14 MB uncompressed)
Offset 0x227000: CCOB header + 36.5 KB compressed (→ 1.21 MB uncompressed)
... (61 more bundles)

Command:
$ clang-offload-bundler --type=o --input=librocblas.so.5 --list

Error:
clang-offload-bundler: error: Failed to decompress input: Could not
decompress embedded file contents: Src size is incorrect

Expected:
Should list all target triples in the bundle, or at minimum process
the first bundle without error.

===================================================================
WHY COMGR WORKS BUT BUNDLER FAILS

COMGR succeeds because the CLR runtime pre-isolates bundles:

  1. CLR reads first CCOB header's TotalFileSize field (hip_fatbin.cpp:275)
  2. CLR extracts exactly TotalFileSize bytes (one CCOB)
  3. CLR passes isolated bundle to COMGR via amd_comgr_do_action()
  4. COMGR writes to temporary file and calls UnbundleFiles()
  5. Even though UnbundleFiles uses the buggy decompressor, there is no
    trailing data after the bundle to cause corruption
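
The pre-isolation in steps 1 and 2 can be sketched as follows. This is not the CLR code. It assumes a version-2-style header in which a 32-bit little-endian total size sits at byte offset 8 (after a 4-byte magic, a 2-byte version and a 2-byte method); those offsets and the helper names are assumptions and should be checked against OffloadBundler.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Assumed layout: magic[4], version[2], method[2], totalFileSize[4], ...
constexpr std::size_t SizeFieldOffset = 8;
constexpr std::size_t MinHeaderSize = 12;

// Reads a little-endian 32-bit value starting at Offset.
inline std::uint32_t readLE32(std::string_view Buf, std::size_t Offset) {
  assert(Offset + 4 <= Buf.size());
  std::uint32_t V = 0;
  for (int I = 3; I >= 0; --I)
    V = (V << 8) | static_cast<unsigned char>(Buf[Offset + I]);
  return V;
}

// Returns only the first bundle of a section, or nullopt if the buffer does
// not start with a plausible header or the size field points past the end.
inline std::optional<std::string_view>
isolateFirstBundle(std::string_view Section) {
  if (Section.size() < MinHeaderSize || Section.substr(0, 4) != "CCOB")
    return std::nullopt;
  std::uint32_t Total = readLE32(Section, SizeFieldOffset);
  if (Total < MinHeaderSize || Total > Section.size())
    return std::nullopt;
  return Section.substr(0, Total);
}
```

Handing the returned slice to the decompressor means that even an unbounded substr(HeaderSize) has nothing extra to read, which is consistent with the runtime path working today.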

clang-offload-bundler fails because:

  1. Tool loads entire .hip_fatbin section (all 64 CCOBs)
  2. Calls decompress() on full buffer
  3. Buggy decompressor reads from HeaderSize to END
  4. Zstd tries to decompress 8+ MB thinking it's one compressed stream
  5. Encounters second CCOB's "CCOB" magic bytes mid-stream
  6. Decompression validation fails

===================================================================
GIT HISTORY

Commit: efda523 (October 2025)
Title: "Fix compress/decompress in LLVM Offloading API (llvm#150064)"

Changes in llvm/lib/Object/OffloadBundle.cpp:

  - StringRef CompressedData = Blob.substr(CurrentOffset);
  + StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);

This fix was applied to llvm/lib/Object/OffloadBundle.cpp but NOT to
the duplicate implementation in clang/lib/Driver/OffloadBundler.cpp.

===================================================================
FIX

Apply the same fix from efda523 to clang/lib/Driver/OffloadBundler.cpp:

File: clang/lib/Driver/OffloadBundler.cpp
Line: ~1270 (in CompressedOffloadBundle::decompress method)

Current code:
StringRef CompressedData = Blob.substr(HeaderSize);

Fixed code:
StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);

This makes the bundler correctly limit decompression to the first CCOB's
declared size, enabling proper handling of concatenated bundles.
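
One detail that may be worth confirming while porting the fix: TotalFileSize comes from the file, and if it is smaller than HeaderSize the subtraction wraps around. As far as I can tell, StringRef::substr clamps an oversized count to the end of the string rather than failing, so a wrapped length would silently bring back the unbounded read. Below is a minimal sketch of a bounds-checked slice, again with std::string_view standing in for StringRef; compressedPayload is a hypothetical helper, not an existing LLVM function.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Returns the compressed payload of the first bundle, or nullopt when the
// declared size is inconsistent with the header size or the buffer size.
inline std::optional<std::string_view>
compressedPayload(std::string_view Blob, std::size_t HeaderSize,
                  std::size_t TotalFileSize) {
  if (TotalFileSize < HeaderSize || TotalFileSize > Blob.size())
    return std::nullopt;
  std::string_view Payload =
      Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
  // With the checks above, the slice is exactly the declared payload length.
  assert(Payload.size() == TotalFileSize - HeaderSize);
  return Payload;
}
```

In the real code the nullopt cases would presumably map to an llvm::Error, so that a corrupt size field is reported instead of being decompressed.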

===================================================================
ADDITIONAL CONTEXT

The extractOffloadBundle() function (llvm/lib/Object/OffloadBundle.cpp:33-99)
shows the intended behavior for concatenated CCOBs:

  1. Iterate through buffer searching for CCOB magic markers
  2. Extract one CCOB at a time using take_front(NextbundleStart)
  3. Decompress isolated bundle
  4. Advance offset to next bundle
  5. Repeat until end of buffer

This loop-based extraction relies on decompress() correctly respecting
TotalFileSize to avoid reading past the current bundle.
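
The loop described above can be sketched as below. This is a simplified illustration, not the extractOffloadBundle() source: it splits purely on the "CCOB" magic, uses std::string_view instead of StringRef, and skips decompression. splitOnMagic is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Splits a buffer into slices that each start at a "CCOB" magic marker and
// run up to the next marker (or the end of the buffer).
inline std::vector<std::string_view> splitOnMagic(std::string_view Buffer) {
  constexpr std::string_view Magic = "CCOB";
  std::vector<std::string_view> Bundles;
  std::size_t Start = Buffer.find(Magic);
  while (Start != std::string_view::npos) {
    // Look for the next marker strictly after the current one.
    std::size_t Next = Buffer.find(Magic, Start + Magic.size());
    std::size_t End = (Next == std::string_view::npos) ? Buffer.size() : Next;
    assert(End - Start >= Magic.size());
    Bundles.push_back(Buffer.substr(Start, End - Start));
    Start = Next;
  }
  return Bundles;
}
```

A split based on the magic alone can cut a bundle short if the compressed payload happens to contain the bytes "CCOB", and it keeps any padding between bundles in the preceding slice. Both are reasons for decompress() to honor TotalFileSize rather than trust the slice length.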

===================================================================
TEST VERIFICATION

After applying the fix, verify with:

$ clang-offload-bundler --type=o --input=librocblas.so.5 --list

Expected output: List of target triples (or at minimum no error)

Test with known working file:
$ clang-offload-bundler --type=o --input=librccl.so.1.0 --list

Should continue to work (single CCOB bundle).

===================================================================
AFFECTED CODE PATHS

clang/lib/Driver/OffloadBundler.cpp:

  • Line 1233-1329: CompressedOffloadBundle::decompress() [BUGGY]
  • Line 1534-1678: OffloadBundler::UnbundleFiles() [calls decompress]

llvm/lib/Object/OffloadBundle.cpp:

  • Line 509-601: CompressedOffloadBundle::decompress() [FIXED]
  • Line 33-99: extractOffloadBundle() [handles multiple CCOBs]

amd/comgr/src/comgr-compiler.cpp:

  • Line 1318-1436: AMDGPUCompiler::unbundle() [uses clang bundler]

===================================================================
REFERENCES

LLVM Repository: /home/stella/workspace/llvm-project

Key files:

  • clang/lib/Driver/OffloadBundler.cpp (needs fix)
  • llvm/lib/Object/OffloadBundle.cpp (already fixed)
  • clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp (CLI tool)
  • clang/include/clang/Driver/OffloadBundler.h (CCOB format defs)

Related CLR runtime code: /home/stella/workspace/rocm-systems/projects/clr

  • hipamd/src/hip_fatbin.cpp:252-523 (runtime bundle loading)
  • hipamd/src/hip_code_object.hpp:44-81 (CCOB format structures)

===================================================================
PRIORITY

Medium-High: Affects tooling for analyzing/repackaging ROCm libraries.
The runtime works fine, but command-line unbundling is broken for
production ROCm libraries like librocblas and librocsparse.

===================================================================
