
llama-cpp: allow to override ROCM GPU targets #435263

Merged

philiptaron merged 1 commit into NixOS:master from kurnevsky:llama-rocm-targets on Aug 23, 2025

Conversation

@kurnevsky
Member

@kurnevsky kurnevsky commented Aug 20, 2025

This allows significantly reducing build time when building for a single GPU. The same was done for whisper-cpp and the llama-cpp flake.
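Concretely, a downstream user could then pin a single architecture with an override along these lines. This is a sketch only: the argument name `rocmGpuTargets` follows the whisper-cpp convention quoted later in this thread, and `gfx1100` is a hypothetical target chosen for illustration.

```nix
# Sketch: build llama-cpp for one ROCm architecture only.
# Assumes the package exposes a `rocmGpuTargets` argument
# (the pattern whisper-cpp uses); adjust to the actual interface.
(llama-cpp.override {
  rocmSupport = true;
  rocmGpuTargets = "gfx1100"; # illustrative RDNA3 target
})
```

Restricting the target list this way avoids compiling device code for every supported architecture, which is where most of the build time goes.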

Things done

  • Built on platform:
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • Tested, as applicable:
  • Ran nixpkgs-review on this PR. See nixpkgs-review usage.
  • Tested basic functionality of all binary files, usually in ./result/bin/.
  • Nixpkgs Release Notes
    • Package update: when the change is major or breaking.
  • NixOS Release Notes
    • Module addition: when adding a new NixOS module.
    • Module update: when the change is significant.
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other READMEs.

Add a 👍 reaction to pull requests you find important.

@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1 This PR causes 1 package to rebuild on Linux. labels Aug 20, 2025
@kurnevsky
Member Author

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review pr 435263
Commit: f9696ce33be44dfd70ec11293d0cdab439409ddd


x86_64-linux

✅ 1 package built:
  • llama-cpp-rocm

@philiptaron philiptaron requested a review from K900 August 20, 2025 15:40
@philiptaron
Contributor

Requesting review from known AMD GPU person @K900. Please forward if you know someone better.

@K900
Contributor

K900 commented Aug 20, 2025

I know nothing about ROCm, you want @LunNova or @Flakebi probably.

@philiptaron
Contributor

I will merge if they approve!

@LunNova
Member

LunNova commented Aug 20, 2025

Would it be better to have a single gpuTargets arg that works for CUDA or HIP depending on what's enabled? That's how some other packages like torch do this.

Otherwise seems fine.

@philiptaron
Contributor

I'd rather not invent an abstraction mechanism like that, but if one already exists and is solid, let's consider using it.

@LunNova
Member

LunNova commented Aug 20, 2025

torch:

gpuTargetString = strings.concatStringsSep ";" (
  if gpuTargets != [ ] then
    # If gpuTargets is specified, it always takes priority.
    gpuTargets
  else if cudaSupport then
    gpuArchWarner supportedCudaCapabilities unsupportedCudaCapabilities
  else if rocmSupport then
    # Remove RDNA1 gfx101x archs from default ROCm support list to avoid
    # use of undeclared identifier 'CK_BUFFER_RESOURCE_3RD_DWORD'
    # TODO: Retest after ROCm 6.4 or torch 2.8
    lib.lists.subtractLists [
      "gfx1010"
      "gfx1012"
    ] (rocmPackages.clr.localGpuTargets or rocmPackages.clr.gpuTargets)
  else
    throw "No GPU targets specified"
);

whisper:

rocmGpuTargets ? builtins.concatStringsSep ";" rocmPackages.clr.gpuTargets,

So we already have fragmentation. No strong feelings.

@kurnevsky
Member Author

llama-cpp can be built with both ROCm and CUDA enabled at the same time, so it's better not to mix architectures in the same argument.
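A sketch of what separate per-backend arguments might look like in the derivation header. The names are illustrative, modeled on the whisper-cpp line quoted above; the CUDA default is an assumption, not the actual llama-cpp interface:

```nix
{
  cudaSupport ? false,
  rocmSupport ? false,
  # Illustrative: independent target lists, so a build with both
  # backends enabled can pin each architecture set separately
  # instead of sharing one mixed `gpuTargets` argument.
  cudaGpuTargets ? cudaPackages.cudaFlags.cudaCapabilities,
  rocmGpuTargets ? builtins.concatStringsSep ";" rocmPackages.clr.gpuTargets,
  ...
}:
```

With this shape, overriding `rocmGpuTargets` cannot accidentally change which CUDA capabilities are compiled, and vice versa.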

@kurnevsky
Member Author

@philiptaron should this be merged then? :)

@philiptaron
Contributor

I'm looking for the approval ☑️

@nixpkgs-ci nixpkgs-ci bot added the 12.approvals: 1 This PR was reviewed and approved by one person. label Aug 23, 2025
@philiptaron philiptaron merged commit 3c40596 into NixOS:master Aug 23, 2025
31 of 33 checks passed