feat: initial amd gpu support with rocm 7.1 #28

Closed

pyqlsa wants to merge 2 commits into utensils:main from pyqlsa:rocm-support

Conversation

@pyqlsa (Contributor) commented Feb 16, 2026

Summary

Intended to address #27 by adding ROCm support for AMD GPUs.

Adds

  • ROCm support for AMD GPUs in the ComfyUI package (via ROCm 7.1). Tested gfx1100 with a 7900 XTX; a couple of the default text-to-image and image-to-image template workflows appear to work as expected. There are a few gotchas with ROCm support in general (e.g. xformers), but known workarounds exist via tunables nominally available in ComfyUI. CPU architecture compatibility is x86_64 only.
  • A ROCm-compatible app so we can nix run .#rocm (tested gfx1100 with a 7900 XTX). CPU architecture compatibility is x86_64 only.
  • A ROCm-compatible container image so we can nix build .#dockerImageROCm (tested gfx1100 with a 7900 XTX running under podman; running with docker is not yet tested). CPU architecture compatibility is x86_64 only.
  • Extended options for the ComfyUI nix module to support configuring the rocm variant (tested on my own system configuration); see the sketch after this list.
  • Updated the CI workflow to build ROCm artifacts (largely just copying the pattern for the CUDA artifacts).
  • README updates in line with the above.
  • Another dev shell, just for ROCm, which pulls in the Python environment that is ultimately passed to ComfyUI; it was helpful for exploration and a bit of debugging.
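
For context, here is roughly how I'd expect the extended module options to be used on a NixOS system. This is an illustrative sketch only; the option path and names may not match the module exactly:

```nix
# Hypothetical usage of the extended module options; the actual
# option names in this PR may differ.
{
  services.comfyui = {
    enable = true;
    # Select the ROCm variant of the package set (see the
    # gpuSupport enum described under "Changes" below).
    gpuSupport = "rocm";
  };
}
```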

Changes

  • Changed the cudaSupport toggle to a gpuSupport enum; users (and internal plumbing) now select between cuda, rocm, and none (for CPU-only). A sketch of the new option shape follows this list.
  • Updated the container section of the README to differentiate between the apps that build and load the container versus just building the container and loading it manually with your tooling of choice (docker or podman).
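
For reference, a minimal sketch of what the boolean-to-enum change looks like in standard module-option terms; the names here are illustrative rather than copied from the diff:

```nix
{ lib, ... }:
{
  # Before: a simple boolean toggle (illustrative).
  # cudaSupport = lib.mkEnableOption "CUDA support";

  # After: a three-way choice covering both GPU backends and CPU-only.
  options.gpuSupport = lib.mkOption {
    type = lib.types.enum [ "cuda" "rocm" "none" ];
    default = "none";
    description = "GPU backend to build against; \"none\" means CPU-only.";
  };
}
```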

Observations/Comments

  • Is it worth creating apps for podman like the ones that exist for docker?
  • Is it worth also trying a spin with ROCm 6.x? (Maybe not; it looks like unstable is already on 7.1.)
  • I didn't need to pass anything via autoPatchelfIgnoreMissingDeps for sox or ffmpeg; I'm not quite sure why that was required for CUDA but not for ROCm.
  • Something in ComfyUI shells out to rocminfo on launch, and I can't figure out yet what it is. I tried adding it to buildInputs and propagatedBuildInputs in torch and in the comfyui derivation, but nothing seemed to stick. Even without it, ComfyUI starts just fine; it writes some warnings to the logs that it couldn't find rocminfo, but it was able to detect ROCm and the GPUs anyway. I threw rocminfo into the container and the systemd unit path to make those warnings go away regardless (see the first sketch after this list).
  • I couldn't figure out how to ignore the test_optim test for timm, so I tried just chucking setuptools in there, and it seemed to work. Based on the surrounding comment and the existing ignore for timm's test_kron test, I'm not sure whether my choice is an anti-pattern (see the second sketch after this list).
  • There are still some TODO and XXX comments in there that overlap with some of these thoughts; I'm happy to clean them up before a potential merge, but they're there for now in case they prompt any specific feedback.
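
On the rocminfo point, this is roughly what the systemd-side workaround looks like. A minimal sketch assuming nixpkgs' rocmPackages set; not necessarily the exact wiring in this PR:

```nix
# Sketch: make rocminfo visible to the ComfyUI service at runtime so
# the launch-time probe stops warning. Assumes nixpkgs' rocmPackages
# attribute set; the PR's actual wiring may differ.
{ pkgs, ... }:
{
  systemd.services.comfyui.path = [ pkgs.rocmPackages.rocminfo ];
}
```

On the timm point, if skipping the test directly turns out to be preferable to adding setuptools, something along these lines might work. This is hypothetical and unverified against this tree; it assumes timm's tests run under pytestCheckHook:

```nix
# Hypothetical alternative inside a Python package-set overlay: skip
# the failing test via pytestCheckHook's disabledTests instead of
# adding setuptools to the inputs.
pyFinal: pyPrev: {
  timm = pyPrev.timm.overridePythonAttrs (old: {
    disabledTests = (old.disabledTests or [ ]) ++ [ "test_optim" ];
  });
}
```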

jamesbrink added a commit that referenced this pull request Feb 20, 2026
Cherry-picked pyqlsa's ROCm implementation (commits 7e6f796, e2343fe)
and applied fixes on top:

- Fix ROCm app incorrectly named "cuda" in nix/apps.nix (would shadow CUDA)
- Fix mkPython passing boolean `false` instead of string `"none"` for gpuSupport
- Fix flake description still referencing v0.12.2 instead of v0.14.2
- Fix README Podman ROCm example referencing latest-cuda instead of latest-rocm
- Fix ROCm torchaudio missing FFmpeg/sox ignore deps (matching CUDA pattern)
- Resolve all TODO/XXX review comments left by pyqlsa
- Update CHANGELOG, CLAUDE.md, and README with ROCm documentation
- Fix CHANGELOG footer links (missing v0.14.2 link, Unreleased pointing to v0.12.2)

Closes #27
Co-authored-by: pyqlsa <26353308+pyqlsa@users.noreply.github.com>
@jamesbrink mentioned this pull request Feb 20, 2026
@jamesbrink (Member)

Thanks @pyqlsa .. sorry I missed this; I cherry-picked your commits as I started a version upgrade before I saw this PR. Please re-open the issue or PR if the new release does not work.

#30

@jamesbrink closed this Feb 20, 2026
jamesbrink added a commit that referenced this pull request Feb 20, 2026
* feat: initial amd gpu support with rocm 7.1

* formatting

* fix: resolve bugs and clean up ROCm support from PR #28

---------

Co-authored-by: pyqlsa <26353308+pyqlsa@users.noreply.github.com>
@jamesbrink (Member)

Your commits were cherry-picked into #30 (preserving your authorship) with bug fixes applied on top. Merged and released as v0.14.2. Thank you for the ROCm implementation!

@pyqlsa (Contributor, Author) commented Feb 21, 2026

No worries at all. I'm back to the keyboard and will check this all out. Thank you @jamesbrink !

@pyqlsa deleted the rocm-support branch February 21, 2026 22:58