Fresh builds of llama.cpp with AMD ROCm™ 7 acceleration for AMDChat

llamacpp-rocm


We provide nightly builds of llama.cpp with AMD ROCm™ 7 acceleration, based on TheRock, so you always get the freshest, cutting-edge builds available. Our automated pipeline targets seamless integration with 🍋 Lemonade and similar AI applications that require high-performance GPU inference.

Important

Contribution & Support Notice: While this project currently focuses on integrating llama.cpp+ROCm in a specific production context, our broader goal is to contribute meaningfully to the llama.cpp+ROCm ecosystem. We're not set up to provide comprehensive technical support, but we welcome collaborations, idea exchanges, or contributions that help advance this space.

🎯 Supported Devices

This build specifically targets the following GPU architectures:

  • gfx1151 (STX Halo GPUs) - Ryzen AI MAX+ Pro 395
  • gfx120X (RDNA4 GPUs) - includes AMD Radeon AI PRO R9700, RX 9070 XT/GRE/9070, RX 9060 XT
  • gfx110X (RDNA3 GPUs) - includes AMD Radeon PRO W7900/W7800/W7700/V710, RX 7900 XTX/XT/GRE, RX 7800 XT, RX 7700 XT

All builds include ROCm™ 7 built-in - no separate ROCm™ installation required!

🚀 Automated Builds

Our automated GitHub Actions workflow creates nightly builds for:

  • Windows and Ubuntu operating systems
  • Multiple GPU targets: gfx1151, gfx120X, gfx110X
  • ROCm™ 7 built-in - complete runtime libraries included
GPU Target   Ubuntu                     Windows
gfx110X      Download Ubuntu gfx110X    Download Windows gfx110X
gfx1151      Download Ubuntu gfx1151    Download Windows gfx1151
gfx120X      Download Ubuntu gfx120X    Download Windows gfx120X

⚡ Ready to Run: All releases include complete ROCm™ 7 runtime libraries - just download and go!


🧪 Quick Smoketest

To verify your download is working correctly:

  1. Download the appropriate build for your GPU target from our latest releases
  2. Extract the archive to your preferred directory
  3. Test with any GGUF model from Hugging Face:
llama-server -m YOUR_GGUF_MODEL_PATH -ngl 99

💡 Tip: Use -ngl 99 to offload all layers to GPU for maximum acceleration. The exact number of layers may vary by model, but 99 ensures all available layers are offloaded.
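Before pointing llama-server at a freshly downloaded model, it can help to confirm the file is actually a GGUF file and not a truncated or HTML-error download. The sketch below checks the GGUF header, which starts with the 4-byte magic b"GGUF" followed by a little-endian uint32 format version; the path used is a hypothetical placeholder, not a file shipped in this repository.

```python
import struct

def gguf_header(path):
    """Return (is_gguf, version); version is None if the magic doesn't match."""
    with open(path, "rb") as f:
        magic = f.read(4)          # GGUF files begin with the ASCII magic "GGUF"
        if magic != b"GGUF":
            return False, None
        (version,) = struct.unpack("<I", f.read(4))  # little-endian uint32 version
        return True, version

# Example (hypothetical path):
# ok, ver = gguf_header("YOUR_GGUF_MODEL_PATH")
```

A mismatched magic usually means the download was interrupted or the link returned an error page instead of the model.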

🍋 Lemonade Integration: You can also test these builds directly with Lemonade for a seamless AI application experience (coming soon!)


📦 Dependencies

This project relies on the following external software and tools:

Core Dependencies

  • llama.cpp - Efficient, cross-platform inference engine for running GGUF models locally.
  • ROCm SDK (TheRock) - AMD’s open-source platform for GPU-accelerated computing.
  • HIP - C++ API for writing portable GPU code within the ROCm ecosystem.

Build Tools & Compilers


🏗️ Code and Artifact Structure

Note

Active Development: This project is under active development. Code and artifact structure are subject to change as we continue to improve and expand functionality.

Key Components

  • docs/ - Contains build documentation and setup guides
  • utils/ - Houses utility scripts for build automation and dependency management
  • GitHub Actions Workflows - Located in .github/workflows/ (automated build pipeline)
  • Build Artifacts - Generated during CI/CD and published as releases

The build process is primarily handled through GitHub Actions, with the repository serving as the source for automated compilation and packaging of llama.cpp with ROCm™ 7 support.


📋 Manual Build Instructions

For detailed manual build instructions, please see: docs/manual_instructions.md

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
