(The analysis below was produced by Claude after we discovered that clang-offload-bundler cannot properly unbundle some of our production libraries.)
===================================================================
BUG REPORT: clang-offload-bundler Fails on Concatenated CCOB Bundles
Date: 2025-10-30
LLVM Project: amd/comgr and clang/tools/clang-offload-bundler
===================================================================
SUMMARY
clang-offload-bundler fails to unbundle files containing multiple
concatenated CCOB (Clang Code Object Bundle) compressed bundles. It
aborts with the error: "Failed to decompress input: Could not decompress
embedded file contents: Src size is incorrect"
This is caused by an incomplete fix in commit efda523 (October 2025)
that fixed llvm/lib/Object/OffloadBundle.cpp but did NOT fix the
duplicate implementation in clang/lib/Driver/OffloadBundler.cpp.
===================================================================
AFFECTED FILES
Real-world ROCm libraries with concatenated CCOB bundles:
- librocblas.so.5 (64 concatenated CCOBs, 8.2 MB .hip_fatbin section)
- librocsparse.so (similar structure)
- Likely other BLAS/sparse libraries
These libraries work fine at runtime (via COMGR) but cannot be
processed by the clang-offload-bundler command-line tool.
===================================================================
ROOT CAUSE
Two implementations of CCOB decompression exist, and only one of them has been fixed:
FIXED Implementation (llvm/lib/Object/OffloadBundle.cpp:546):
StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
✅ Correctly limits decompression to first CCOB's TotalFileSize
✅ Handles concatenated CCOBs properly
BUGGY Implementation (clang/lib/Driver/OffloadBundler.cpp:1270):
StringRef CompressedData = Blob.substr(HeaderSize);
❌ Reads from HeaderSize to END of buffer
❌ When buffer contains multiple concatenated CCOBs, includes all of them
❌ Zstd decompressor is handed bytes beyond the first bundle's boundary
❌ The bytes after the first compressed stream (starting with the second CCOB's header) are not valid Zstd input, so decompression fails with "Src size is incorrect"
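To illustrate the difference between the two slices, here is a minimal sketch that uses std::string_view in place of llvm::StringRef. The function names and the toy buffer are hypothetical and not taken from either LLVM file; only the two substr() forms mirror the lines quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Buggy form: everything after the header, up to the END of the buffer.
// With concatenated bundles this also returns the bundles that follow.
inline std::string_view payloadToEnd(std::string_view Blob,
                                     std::size_t HeaderSize) {
  return Blob.substr(HeaderSize);
}

// Fixed form: only the compressed bytes that belong to the first bundle.
// TotalFileSize is the bundle's declared size, header included.
inline std::string_view payloadOfFirstBundle(std::string_view Blob,
                                             std::size_t HeaderSize,
                                             std::size_t TotalFileSize) {
  // Sanity check added for this sketch: the declared size must cover the
  // header and must not exceed the buffer.
  assert(HeaderSize <= TotalFileSize && TotalFileSize <= Blob.size());
  return Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
}
```

For a toy buffer "HDR1aaaaHDR2bb" (two bundles, each with a 4-byte header), payloadToEnd(Blob, 4) returns "aaaaHDR2bb", while payloadOfFirstBundle(Blob, 4, 8) returns only "aaaa".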
===================================================================
REPRODUCTION
Test File: librocblas.so.5 from ROCm 6.x distribution
.hip_fatbin section: 8,163,887 bytes containing 64 concatenated CCOBs
Structure:
Offset 0x0: CCOB header + 1.16 MB compressed (→ 12.41 MB uncompressed)
Offset 0x129000: CCOB header + 1.01 MB compressed (→ 13.14 MB uncompressed)
Offset 0x227000: CCOB header + 36.5 KB compressed (→ 1.21 MB uncompressed)
... (61 more bundles)
Command:
$ clang-offload-bundler --type=o --input=librocblas.so.5 --list
Error:
clang-offload-bundler: error: Failed to decompress input: Could not
decompress embedded file contents: Src size is incorrect
Expected:
Should list all target triples in the bundle, or at minimum process
the first bundle without error.
===================================================================
WHY COMGR WORKS BUT BUNDLER FAILS
COMGR succeeds because the CLR runtime pre-isolates bundles:
- CLR reads first CCOB header's TotalFileSize field (hip_fatbin.cpp:275)
- CLR extracts exactly TotalFileSize bytes (one CCOB)
- CLR passes isolated bundle to COMGR via amd_comgr_do_action()
- COMGR writes to temporary file and calls UnbundleFiles()
- Even though UnbundleFiles uses the buggy decompressor, there is no
trailing data after the isolated bundle to cause corruption
clang-offload-bundler fails because:
- Tool loads entire .hip_fatbin section (all 64 CCOBs)
- Calls decompress() on full buffer
- Buggy decompressor reads from HeaderSize to END
- Zstd tries to decompress 8+ MB thinking it's one compressed stream
- Encounters second CCOB's "CCOB" magic bytes mid-stream
- Decompression validation fails
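The pre-isolation step that makes the runtime path work can be sketched roughly as follows. This is an illustration only, not the code in hip_fatbin.cpp: the header layout (a 32-bit little-endian TotalFileSize at byte offset 8) is an assumption made for the sketch, and the names kSizeFieldOffset, kMinHeaderBytes, readLE32 and isolateFirstBundle are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// ASSUMPTION (illustrative layout, not copied from hip_fatbin.cpp or
// OffloadBundler.h): the 4-byte "CCOB" magic, 4 further header bytes, then a
// 32-bit little-endian TotalFileSize at byte offset 8.
constexpr std::size_t kSizeFieldOffset = 8;
constexpr std::size_t kMinHeaderBytes = kSizeFieldOffset + 4;

// Read a 32-bit little-endian integer; the caller guarantees it is in bounds.
inline std::uint32_t readLE32(std::string_view Buf, std::size_t Off) {
  assert(Off + 4 <= Buf.size());
  std::uint32_t V = 0;
  for (int I = 3; I >= 0; --I)
    V = (V << 8) | static_cast<unsigned char>(Buf[Off + I]);
  return V;
}

// Return only the first bundle of a section that may hold several
// concatenated bundles, or nullopt if the header is missing or inconsistent.
inline std::optional<std::string_view>
isolateFirstBundle(std::string_view Section) {
  if (Section.size() < kMinHeaderBytes || Section.substr(0, 4) != "CCOB")
    return std::nullopt;
  const std::uint32_t Total = readLE32(Section, kSizeFieldOffset);
  if (Total < kMinHeaderBytes || Total > Section.size())
    return std::nullopt;
  return Section.substr(0, Total);
}
```

A caller that passes the result of isolateFirstBundle() to a decompressor never exposes it to the following bundles, which is why a slice to the end of the buffer is harmless on that path.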
===================================================================
GIT HISTORY
Commit: efda523 (October 2025)
Title: "Fix compress/decompress in LLVM Offloading API (llvm#150064)"
Changes in llvm/lib/Object/OffloadBundle.cpp:
- StringRef CompressedData = Blob.substr(CurrentOffset);
+ StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
This fix was applied to llvm/lib/Object/OffloadBundle.cpp but NOT to
the duplicate implementation in clang/lib/Driver/OffloadBundler.cpp.
===================================================================
FIX
Apply the same fix from efda523 to clang/lib/Driver/OffloadBundler.cpp:
File: clang/lib/Driver/OffloadBundler.cpp
Line: ~1270 (in CompressedOffloadBundle::decompress method)
Current code:
StringRef CompressedData = Blob.substr(HeaderSize);
Fixed code:
StringRef CompressedData = Blob.substr(HeaderSize, TotalFileSize - HeaderSize);
This makes the bundler correctly limit decompression to the first CCOB's
declared size, enabling proper handling of concatenated bundles.
===================================================================
ADDITIONAL CONTEXT
The extractOffloadBundle() function (llvm/lib/Object/OffloadBundle.cpp:33-99)
shows the intended behavior for concatenated CCOBs:
- Iterate through buffer searching for CCOB magic markers
- Extract one CCOB at a time using take_front(NextbundleStart)
- Decompress isolated bundle
- Advance offset to next bundle
- Repeat until end of buffer
This loop-based extraction relies on decompress() correctly respecting
TotalFileSize to avoid reading past the current bundle.
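The loop described above can be approximated with the following sketch. It is not the extractOffloadBundle() implementation: it uses std::string_view instead of LLVM types, skips the decompression step, and the name splitConcatenatedBundles is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Split a buffer into slices that each start at a "CCOB" magic marker and end
// just before the next marker (or at the end of the buffer).
inline std::vector<std::string_view>
splitConcatenatedBundles(std::string_view Buffer) {
  constexpr std::string_view Magic = "CCOB";
  std::vector<std::string_view> Bundles;

  std::size_t Start = Buffer.find(Magic);
  while (Start != std::string_view::npos) {
    // The next bundle begins at the next marker after this bundle's own magic.
    const std::size_t Next = Buffer.find(Magic, Start + Magic.size());
    const std::size_t Len = (Next == std::string_view::npos)
                                ? Buffer.size() - Start
                                : Next - Start;
    Bundles.push_back(Buffer.substr(Start, Len));
    assert(Bundles.back().substr(0, Magic.size()) == Magic);
    Start = Next; // Advance to the next bundle, or stop at npos.
  }
  return Bundles;
}
```

Each slice produced this way may still carry padding after the compressed payload, and a marker search can in principle match bytes inside a payload, so the decompressor still has to honor TotalFileSize for every slice it receives.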
===================================================================
TEST VERIFICATION
After applying fix, verify with:
$ clang-offload-bundler --type=o --input=librocblas.so.5 --list
Expected output: List of target triples (or at minimum no error)
Test with known working file:
$ clang-offload-bundler --type=o --input=librccl.so.1.0 --list
Should continue to work (single CCOB bundle).
===================================================================
AFFECTED CODE PATHS
clang/lib/Driver/OffloadBundler.cpp:
- Lines 1233-1329: CompressedOffloadBundle::decompress() [BUGGY]
- Lines 1534-1678: OffloadBundler::UnbundleFiles() [calls decompress]
llvm/lib/Object/OffloadBundle.cpp:
- Lines 509-601: CompressedOffloadBundle::decompress() [FIXED]
- Lines 33-99: extractOffloadBundle() [handles multiple CCOBs]
amd/comgr/src/comgr-compiler.cpp:
- Lines 1318-1436: AMDGPUCompiler::unbundle() [uses clang bundler]
===================================================================
REFERENCES
LLVM Repository: /home/stella/workspace/llvm-project
Key files:
- clang/lib/Driver/OffloadBundler.cpp (needs fix)
- llvm/lib/Object/OffloadBundle.cpp (already fixed)
- clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp (CLI tool)
- clang/include/clang/Driver/OffloadBundler.h (CCOB format defs)
Related CLR runtime code: /home/stella/workspace/rocm-systems/projects/clr
- hipamd/src/hip_fatbin.cpp:252-523 (runtime bundle loading)
- hipamd/src/hip_code_object.hpp:44-81 (CCOB format structures)
===================================================================
PRIORITY
Medium-High: Affects tooling for analyzing/repackaging ROCm libraries.
The runtime works fine, but command-line unbundling is broken for
production ROCm libraries like librocblas and librocsparse.
===================================================================