Kernel Database Documentation Index

This directory documents the ROCm kernel databases used by rocBLAS and hipBLASLt, focusing on how kernels are organized and stored and on how they can be split for kpack distribution.

Documents

1. KERNEL-DATABASE-SUMMARY.txt (START HERE)

Type: Executive Summary
Size: ~6 KB
Best For: Quick reference, understanding key concepts at a glance

Contents:

Overview and core findings
Size metrics for hipBLASLt and rocBLAS
File types and formats (HSACO, CO, DAT)
Architecture support matrix
Key properties and implications for kpack splitting
Practical installation locations

Read this first if you're new to the kernel databases or need a quick refresher on the essentials.

2. kernel-databases.md (COMPREHENSIVE OVERVIEW)

Type: Detailed Technical Document
Size: ~10 KB
Best For: Understanding the complete architecture and organization

Contents:

Overall size metrics for both libraries
Installation directory structure with examples
Detailed file format analysis (HSACO, CO, DAT)
Architecture-specific organization
File naming conventions and component breakdown
Source tree organization
Database characteristics
Implications for kpack splitting
Size estimates for splitting scenarios

Read this when you need comprehensive understanding of how the databases are structured and organized.

3. kernel-databases-technical-reference.md (DEEP DIVE)

Type: Technical Reference Manual
Size: ~14 KB
Best For: Implementation details, parsing, and splitting strategy

Contents:

Complete file listing and patterns for both libraries
File format specifications with hexdump examples
Architecture-specific variants enumeration
Kernel naming component breakdown
Runtime loading and selection mechanisms
Size and storage implications analysis
Python code examples for parsing DAT files
Detailed kpack splitting strategy with package structure
Dependency chain management
Optimization opportunities
Testing and validation procedures
Programmatic parsing examples

Read this when implementing tools, parsing metadata, or planning the actual split.

Quick Reference

File Type Summary

Type	Format	Size	Purpose
.hsaco	ELF binary	160 KB - 3.7 MB	Compiled GPU kernel archive
.co	ELF binary	40 KB - 590 KB	Individual compiled kernel
.dat	MessagePack	4 KB - 610 KB	Metadata/configuration index
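
One way to sanity-check a file against this table is to compare its extension with its leading bytes. The sketch below assumes that `.hsaco` and `.co` files are plain ELF objects, as the table states, and therefore start with the standard ELF magic `\x7fELF`. The `classify` helper and the `EXPECTED` mapping are illustrative names, not part of rocBLAS or hipBLASLt.

```python
from pathlib import Path
from typing import Tuple

ELF_MAGIC = b"\x7fELF"  # standard 4-byte ELF identification prefix

# Expected container per extension, taken from the File Type Summary table.
EXPECTED = {".hsaco": "elf", ".co": "elf", ".dat": "msgpack"}


def classify(path: Path) -> Tuple[str, bool]:
    """Return (expected_kind, looks_consistent) for one database file."""
    kind = EXPECTED.get(path.suffix, "unknown")
    with path.open("rb") as fh:
        head = fh.read(4)
    if kind == "elf":
        return kind, head == ELF_MAGIC
    if kind == "msgpack":
        # MessagePack has no fixed magic number, so this only checks that
        # the file is non-empty and is not an ELF object by mistake.
        return kind, len(head) > 0 and head != ELF_MAGIC
    return kind, False
```

A `False` result for a `.co` or `.hsaco` file does not necessarily mean the file is corrupt. It may only mean the plain-ELF assumption does not hold for that file.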

Size at a Glance

hipBLASLt gfx1100:  42 MB  (304 files)
hipBLASLt gfx1201:  320 MB (590 files)
rocBLAS gfx1100:    84 MB  (439 files)
rocBLAS gfx1201:    21 MB
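
Figures like these can be re-measured on any install tree with a short script. The following is a minimal sketch, assuming sizes are reported in binary megabytes (1 MB = 1024 * 1024 bytes). The document does not say which unit was used, so small differences are expected. `summarize` and `format_mb` are illustrative helper names.

```python
from pathlib import Path
from typing import Tuple


def summarize(library_dir: Path) -> Tuple[int, int]:
    """Return (total_bytes, file_count) for all regular files under library_dir."""
    files = [p for p in library_dir.rglob("*") if p.is_file()]
    return sum(p.stat().st_size for p in files), len(files)


def format_mb(num_bytes: int) -> str:
    """Format a byte count as whole megabytes (assumption: 1 MB = 1024 * 1024 bytes)."""
    return f"{num_bytes / (1024 * 1024):.0f} MB"
```

For example, `summarize(Path("<prefix>/lib/hipblaslt/library"))` returns the byte total and file count for one library, where `<prefix>` stands for an install root such as those listed under "Practical Locations".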

Key Properties

Kernels are architecture-specific - cannot be shared between architectures
Files are paired (.co + .dat) for lazy loading
Few shared files (e.g., hipblasltTransform.hsaco)
Clean separation - no cross-architecture dependencies
Compression friendly - ELF binaries shrink by roughly 40-50% when compressed

Recommended kpack Splitting

rocm-hipblaslt-gfx1100-kernels.kpack  (~14-20 MB)
rocm-hipblaslt-gfx1101-kernels.kpack  (~14-20 MB)
rocm-hipblaslt-gfx1103-kernels.kpack  (~14-20 MB)
rocm-hipblaslt-gfx1201-kernels.kpack  (~150-200 MB)
rocm-hipblaslt-shared.kpack            (~2-3 MB)

rocm-rocblas-gfx1100-kernels.kpack    (~20-30 MB)
rocm-rocblas-gfx1101-kernels.kpack    (~20-30 MB)
rocm-rocblas-gfx1102-kernels.kpack    (~20-30 MB)
rocm-rocblas-gfx1103-kernels.kpack    (~20-30 MB)
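
The naming scheme above can be derived mechanically from filenames. The sketch below is one possible approach, not the actual kpack tooling. It assumes the target architecture appears in each filename as a `gfx` token followed by hex digits, and that any file without such a token belongs in a shared package. `package_for` and `plan` are hypothetical helpers.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Assumption: the architecture appears in the filename as "gfx" followed
# by hex digits (for example gfx1100 or gfx1201).
ARCH_RE = re.compile(r"gfx[0-9a-f]+")


def package_for(library: str, filename: str) -> str:
    """Map one kernel-database file to a package name in the scheme above."""
    match = ARCH_RE.search(filename)
    if match is None:
        # No architecture tag: treat the file as shared.
        return f"rocm-{library}-shared.kpack"
    return f"rocm-{library}-{match.group(0)}-kernels.kpack"


def plan(library: str, library_dir: Path) -> Dict[str, List[Path]]:
    """Group every regular file under library_dir by its target package."""
    packages: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(library_dir.rglob("*")):
        if path.is_file():
            packages[package_for(library, path.name)].append(path)
    return dict(packages)
```

With this mapping, `package_for("hipblaslt", "hipblasltTransform.hsaco")` yields `rocm-hipblaslt-shared.kpack`. The list above names no shared package for rocBLAS, so an untagged rocBLAS file would land in a `rocm-rocblas-shared.kpack` that this sketch invents by analogy.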

Key Findings

1. Architecture Isolation is Absolute

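Apart from the few shared files noted under finding 5, every kernel file in the database includes its target architecture in its filename. As a result:

Files can be separated by architecture from the filename alone
No architecture-tagged kernel is shared between architectures
Only the files a given system needs have to be unpacked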

2. Metadata Format is MessagePack

The .dat files use MessagePack binary serialization. The format:

Is compact and efficient
Supports lazy loading indicators
Contains all necessary kernel selection information
Can be parsed with standard msgpack libraries
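
A first look inside a `.dat` file might go as follows. This is a minimal sketch that assumes the third-party `msgpack` Python package is installed. It deliberately assumes nothing about the schema, since the key names inside the files are not described here. `load_dat` and `describe` are illustrative helpers.

```python
from collections import Counter
from pathlib import Path


def load_dat(path: Path):
    """Decode one .dat file. Requires the third-party `msgpack` package."""
    import msgpack  # pip install msgpack

    # strict_map_key=False also accepts maps whose keys are not str/bytes.
    return msgpack.unpackb(path.read_bytes(), raw=False, strict_map_key=False)


def describe(obj, depth: int = 2):
    """Summarize the shape of a decoded object without assuming a schema."""
    if isinstance(obj, dict):
        if depth == 0:
            return f"map[{len(obj)}]"
        return {key: describe(value, depth - 1) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        kinds = Counter(type(item).__name__ for item in obj)
        return f"array[{len(obj)}] of {dict(kinds)}"
    return type(obj).__name__
```

Calling `describe(load_dat(path))` prints the top-level keys and the element types beneath them. That is usually enough to decide which fields deserve a closer look.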

3. Size Scaling is Linear

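Based on the gfx1100 measurements, each additional architecture of similar size adds:

hipBLASLt: ~42 MB per architecture
rocBLAS: ~84 MB per architecture

With compression (a 40-50% reduction), this drops to:

hipBLASLt: ~21-25 MB per architecture compressed
rocBLAS: ~42-50 MB per architecture compressed

Per-architecture size is not uniform, however. gfx1201 measures 320 MB for hipBLASLt and 21 MB for rocBLAS (see "Size at a Glance"), so total size grows with the number of architectures, but each architecture should be measured individually.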
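
The compressed figures are simple arithmetic on the measured sizes. The sketch below applies an assumed 40-50% reduction to the four sizes listed under "Size at a Glance". The helper name `compressed_range` is illustrative, and real ratios depend on the compressor and on the kernels themselves.

```python
from typing import Tuple


def compressed_range(size_mb: float, low: float = 0.40, high: float = 0.50) -> Tuple[float, float]:
    """Return (smallest_mb, largest_mb) after a fractional reduction between low and high."""
    return size_mb * (1 - high), size_mb * (1 - low)


# Uncompressed sizes in MB, copied from "Size at a Glance".
MEASURED = {
    "hipBLASLt gfx1100": 42,
    "hipBLASLt gfx1201": 320,
    "rocBLAS gfx1100": 84,
    "rocBLAS gfx1201": 21,
}

for name, size in MEASURED.items():
    smallest, largest = compressed_range(size)
    print(f"{name}: {smallest:.1f}-{largest:.1f} MB compressed")
```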

4. File Pairing is Essential

.co and .dat files MUST stay together:

Metadata (.dat) references paths to compiled kernels (.co)
Runtime uses metadata to locate kernel binaries
Splitting them breaks kernel loading
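
A packaging step can guard against accidentally separating pairs. The sketch below assumes a `.co` file and its `.dat` partner share the same path apart from the extension. That assumption should be confirmed against the actual file listing, because one metadata file may reference several kernel binaries. `unpaired` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, List


def unpaired(library_dir: Path) -> Dict[str, List[str]]:
    """Report .co files without a .dat sibling, and .dat files without a .co.

    Assumption: a pair shares the same path minus the extension.
    """
    co = {p.with_suffix("") for p in library_dir.rglob("*.co")}
    dat = {p.with_suffix("") for p in library_dir.rglob("*.dat")}
    return {
        "co_without_dat": sorted(str(p) + ".co" for p in co - dat),
        "dat_without_co": sorted(str(p) + ".dat" for p in dat - co),
    }
```

Running this on a package's staging directory before building it shows whether a split has stranded one half of a pair. Both lists are empty when every file has its partner.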

5. Shared Files are Rare

The only shared file identified so far:

hipblasltTransform.hsaco - used across all architectures
Everything else is architecture-specific

Practical Locations

Installed Libraries (on this system)

/home/stella/claude-rocm-workspace/rocm/gfx1100/lib/hipblaslt/library/
/home/stella/claude-rocm-workspace/rocm/gfx1100/lib/rocblas/library/
/home/stella/claude-rocm-workspace/rocm/gfx1201/lib/hipblaslt/library/
/home/stella/claude-rocm-workspace/rocm/gfx1201/lib/rocblas/library/

Source Trees

/develop/therock/rocm-libraries/projects/hipblaslt/library/src/amd_detail/rocblaslt/src/Tensile/Logic/
/develop/therock/rocm-libraries/projects/rocblas/library/src/

How to Use These Documents

For Quick Understanding

Read KERNEL-DATABASE-SUMMARY.txt (5 min)
Review "Quick Reference" section above

For Implementation

Read KERNEL-DATABASE-SUMMARY.txt (5 min)
Read kernel-databases.md (10 min)
Review kernel-databases-technical-reference.md section on "Splitting Strategy for kpack"

For Deep Technical Understanding

Read all three documents in order
Study file format specifications in technical reference
Review Python parsing examples

For Debugging

Check "File Format Specifications" in technical reference
Use validation procedures in "Testing and Validation" section
Reference "Architecture-Specific Variants" for expected files

Key Takeaways

Kernel databases are architecture-specific - kernels cannot be reused across architectures
Files come in paired sets (.co + .dat) - keep together
MessagePack format for metadata - easily parseable
Linear scaling with additional architectures
Compression friendly - ~40-50% reduction possible
Clean separation enables selective unpacking
Few shared files - most are architecture-tagged

Additional Context

These kernel databases are critical for:

Matrix multiplication operations (GEMM) - Tensile-generated kernels
Extended operations (LayerNorm, Softmax, AMax)
Multi-precision support (float, bfloat16, half)
Architecture-specific optimizations

The kpack splitting strategy aims to:

Reduce installation size for single-architecture systems
Enable selective downloading of only needed kernels
Maintain clean separation and modularity
Support future architecture additions
