[HIP] compressed bundle format defaults to v3 #152600

Merged
merged 1 commit into from
Aug 8, 2025

yxsamliu
Collaborator

@yxsamliu yxsamliu commented Aug 7, 2025

HIP runtime support for compressed bundle format v3 is in place, so this switches the default compressed bundle format to v3 in the compiler.

This allows both the compressed and the decompressed fat binary size to exceed 4GB by default.

The environment variable COMPRESSED_BUNDLE_FORMAT_VERSION=2 can be set for backward compatibility with older HIP runtimes that do not support v3.
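
For readers who want to see the override behaviour spelled out, here is a minimal sketch of how a tool could resolve the format version from the environment. It is not the code in clang: `pickBundleVersion` and `kDefaultVersion` are hypothetical names, and the rule that anything other than `2` or `3` falls back to the default is an assumption made for illustration.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

// Mirrors the new DefaultVersion value from this patch (hypothetical constant name).
constexpr std::uint16_t kDefaultVersion = 3;

// Hypothetical helper, not the actual clang implementation: returns the
// version requested via COMPRESSED_BUNDLE_FORMAT_VERSION, else the default.
std::uint16_t pickBundleVersion() {
  const char *Env = std::getenv("COMPRESSED_BUNDLE_FORMAT_VERSION");
  if (!Env)
    return kDefaultVersion; // variable unset: use v3
  const std::string Value(Env);
  // Assumption: only "2" and "3" are recognised; other values are ignored.
  if (Value == "2")
    return 2;
  if (Value == "3")
    return 3;
  return kDefaultVersion;
}
```

With the variable unset this returns 3; exporting `COMPRESSED_BUNDLE_FORMAT_VERSION=2` before invoking the tool selects the older layout.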

Fixes: SWDEV-548879

@llvmbot added the clang and clang:driver labels Aug 7, 2025
@llvmbot
Member

llvmbot commented Aug 7, 2025

@llvm/pr-subscribers-clang

Author: Yaxun (Sam) Liu (yxsamliu)

Full diff: https://github.com/llvm/llvm-project/pull/152600.diff

4 Files Affected:

  • (modified) clang/docs/ClangOffloadBundler.rst (+4-4)
  • (modified) clang/include/clang/Driver/OffloadBundler.h (+1-1)
  • (modified) clang/test/Driver/clang-offload-bundler-zlib.c (+23)
  • (modified) clang/test/Driver/clang-offload-bundler-zstd.c (+2-2)
diff --git a/clang/docs/ClangOffloadBundler.rst b/clang/docs/ClangOffloadBundler.rst
index 62cf1642a03a3..5570dbb08ab9a 100644
--- a/clang/docs/ClangOffloadBundler.rst
+++ b/clang/docs/ClangOffloadBundler.rst
@@ -525,15 +525,15 @@ The compressed offload bundle begins with a header followed by the compressed bi
     This is a unique identifier to distinguish compressed offload bundles. The value is the string 'CCOB' (Compressed Clang Offload Bundle).
 
 - **Version Number (16-bit unsigned int)**:
-    This denotes the version of the compressed offload bundle format. The current version is `2`.
+    This denotes the version of the compressed offload bundle format. The current version is `3`.
 
 - **Compression Method (16-bit unsigned int)**:
     This field indicates the compression method used. The value corresponds to either `zlib` or `zstd`, represented as a 16-bit unsigned integer cast from the LLVM compression enumeration.
 
-- **Total File Size (32-bit unsigned int)**:
+- **Total File Size (unsigned int, 32-bit in v2, 64-bit in v3)**:
     This is the total size (in bytes) of the file, including the header. Available in version 2 and above.
 
-- **Uncompressed Binary Size (32-bit unsigned int)**:
+- **Uncompressed Binary Size (unsigned int, 32-bit in v2, 64-bit in v3)**:
     This is the size (in bytes) of the binary data before it was compressed.
 
 - **Hash (64-bit unsigned int)**:
@@ -542,4 +542,4 @@ The compressed offload bundle begins with a header followed by the compressed bi
 - **Compressed Data**:
     The actual compressed binary data follows the header. Its size can be inferred from the total size of the file minus the header size.
 
-    > **Note**: Version 3 of the format is under development. It uses 64-bit fields for Total File Size and Uncompressed Binary Size to support files larger than 4GB. To experiment with version 3, set the environment variable `COMPRESSED_BUNDLE_FORMAT_VERSION=3`. This support is experimental and not recommended for production use.
+    > **Note**: Version 3 is now the default format. For backward compatibility with older HIP runtimes that support version 2 only, set the environment variable `COMPRESSED_BUNDLE_FORMAT_VERSION=2`.
diff --git a/clang/include/clang/Driver/OffloadBundler.h b/clang/include/clang/Driver/OffloadBundler.h
index 667156a524b79..e7306ce3cc9ab 100644
--- a/clang/include/clang/Driver/OffloadBundler.h
+++ b/clang/include/clang/Driver/OffloadBundler.h
@@ -120,7 +120,7 @@ class CompressedOffloadBundle {
     static llvm::Expected<CompressedBundleHeader> tryParse(llvm::StringRef);
   };
 
-  static inline const uint16_t DefaultVersion = 2;
+  static inline const uint16_t DefaultVersion = 3;
 
   static llvm::Expected<std::unique_ptr<llvm::MemoryBuffer>>
   compress(llvm::compression::Params P, const llvm::MemoryBuffer &Input,
diff --git a/clang/test/Driver/clang-offload-bundler-zlib.c b/clang/test/Driver/clang-offload-bundler-zlib.c
index b026e2ec99877..643af989e5670 100644
--- a/clang/test/Driver/clang-offload-bundler-zlib.c
+++ b/clang/test/Driver/clang-offload-bundler-zlib.c
@@ -66,6 +66,29 @@
 // NOHOST-V3-DAG: hip-amdgcn-amd-amdhsa--gfx900
 // NOHOST-V3-DAG: hip-amdgcn-amd-amdhsa--gfx906
 
+// Check compression/decompression of offload bundle using version 2 format.
+//
+// RUN: env OFFLOAD_BUNDLER_COMPRESS=1 OFFLOAD_BUNDLER_VERBOSE=1 COMPRESSED_BUNDLE_FORMAT_VERSION=2 \
+// RUN:   clang-offload-bundler -type=bc -targets=hip-amdgcn-amd-amdhsa--gfx900,hip-amdgcn-amd-amdhsa--gfx906 \
+// RUN:   -input=%t.tgt1 -input=%t.tgt2 -output=%t.hip.bundle.bc 2>&1 | \
+// RUN:   FileCheck -check-prefix=COMPRESS-V2 %s
+// RUN: clang-offload-bundler -type=bc -list -input=%t.hip.bundle.bc | FileCheck -check-prefix=NOHOST-V2 %s
+// RUN: env OFFLOAD_BUNDLER_VERBOSE=1 \
+// RUN:   clang-offload-bundler -type=bc -targets=hip-amdgcn-amd-amdhsa--gfx900,hip-amdgcn-amd-amdhsa--gfx906 \
+// RUN:   -output=%t.res.tgt1 -output=%t.res.tgt2 -input=%t.hip.bundle.bc -unbundle 2>&1 | \
+// RUN:   FileCheck -check-prefix=DECOMPRESS-V2 %s
+// RUN: diff %t.tgt1 %t.res.tgt1
+// RUN: diff %t.tgt2 %t.res.tgt2
+//
+// COMPRESS-V2: Compressed bundle format version: 2
+// COMPRESS-V2: Compression method used: zlib
+// COMPRESS-V2: Compression level: 6
+// DECOMPRESS-V2: Compressed bundle format version: 2
+// DECOMPRESS-V2: Decompression method: zlib
+// DECOMPRESS-V2: Hashes match: Yes
+// NOHOST-V2-NOT: host-
+// NOHOST-V2-DAG: hip-amdgcn-amd-amdhsa--gfx900
+// NOHOST-V2-DAG: hip-amdgcn-amd-amdhsa--gfx906
+
 // Check -compression-level= option
 
 // RUN: clang-offload-bundler -type=bc -targets=hip-amdgcn-amd-amdhsa--gfx900,hip-amdgcn-amd-amdhsa--gfx906 \
diff --git a/clang/test/Driver/clang-offload-bundler-zstd.c b/clang/test/Driver/clang-offload-bundler-zstd.c
index 667d9554daec7..c1123ae5acb38 100644
--- a/clang/test/Driver/clang-offload-bundler-zstd.c
+++ b/clang/test/Driver/clang-offload-bundler-zstd.c
@@ -29,11 +29,11 @@
 // RUN: diff %t.tgt1 %t.res.tgt1
 // RUN: diff %t.tgt2 %t.res.tgt2
 //
-// CHECK: Compressed bundle format version: 2
+// CHECK: Compressed bundle format version: 3
 // CHECK: Total file size (including headers): [[SIZE:[0-9]*]] bytes
 // CHECK: Compression method used: zstd
 // CHECK: Compression level: 3
-// CHECK: Compressed bundle format version: 2
+// CHECK: Compressed bundle format version: 3
 // CHECK: Total file size (from header): [[SIZE]] bytes
 // CHECK: Decompression method: zstd
 // CHECK: Hashes match: Yes

@llvmbot
Member

llvmbot commented Aug 7, 2025

@llvm/pr-subscribers-clang-driver

Contributor

@jhuber6 jhuber6 left a comment

Random question, what would compression look like for the offload binaries? I'm wondering how difficult it would be to switch HIP to using those instead of the bundles for its binary format.

@yxsamliu
Collaborator Author

yxsamliu commented Aug 8, 2025

> Random question, what would compression look like for the offload binaries? I'm wondering how difficult it would be to switch HIP to using those instead of the bundles for its binary format.

There have been efforts to compress arbitrary ELF sections (https://maskray.me/blog/2023-07-07-compressed-arbitrary-sections); however, that may take time, and it may not apply to COFF on Windows. To achieve a high compression ratio, it is critical to use zstd to compress all code objects together, so we need a format that aggregates all code objects and compresses them as one unit.

@yxsamliu yxsamliu merged commit 479556c into llvm:main Aug 8, 2025
10 checks passed