@slojosic-amd
Collaborator

This PR makes it possible to test performance at larger context sizes with GGML_HIP_ROCWMMA_FATTN disabled.
It should help fix issues like this one: #36
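
For anyone who wants to reproduce this locally, here is a minimal sketch of a configure command with the option switched off. The thread does not show the build invocation this repository actually uses, so the GPU target (`gfx1151`, assumed here for Strix Halo), the build directory, and the helper name `print_configure_cmd` are all illustrative assumptions. The helper only prints the command so it can be reviewed before running.

```shell
# Sketch only: print (do not run) a CMake configure command for a HIP build
# with the rocWMMA flash-attention path disabled.
# Assumptions: GPU target gfx1151 and build directory "build" are placeholders.
print_configure_cmd() {
  gpu_target="${1:-gfx1151}"   # assumed target; adjust for your GPU
  build_dir="${2:-build}"      # assumed build directory
  printf 'cmake -S . -B %s -DGGML_HIP=ON -DGGML_HIP_ROCWMMA_FATTN=OFF -DAMDGPU_TARGETS=%s -DCMAKE_BUILD_TYPE=Release\n' \
    "$build_dir" "$gpu_target"
}

print_configure_cmd
```

Run from a llama.cpp checkout, the printed command can be pasted into the shell as-is, followed by `cmake --build build` to compile.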

@Goldenkoron

Please merge this. ROCm has been close to unusable on Strix Halo for a while now, and I want to use newer models that are not compatible with the older working versions.

@slojosic-amd
Collaborator Author

@Goldenkoron did you confirm with https://github.com/lemonade-sdk/llamacpp-rocm/actions/runs/20528450015/artifacts/4972554015 that disabling GGML_HIP_ROCWMMA_FATTN gives you better perf numbers for #36?

@Goldenkoron

> @Goldenkoron did you confirm with https://github.com/lemonade-sdk/llamacpp-rocm/actions/runs/20528450015/artifacts/4972554015 that disabling GGML_HIP_ROCWMMA_FATTN gives you better perf numbers for #36

Sorry, I didn't see that a Windows release had been posted. I'll test it later today when I'm off work and report back.

@danielholanda
Contributor

Thanks @slojosic-amd and @Goldenkoron. Please provide some numbers showing that this indeed solves the problem, and I will be happy to merge this.

@danielholanda
Contributor

Confirmed, as discussed in #36.

@danielholanda merged commit f40a8fa into main on Feb 3, 2026
25 of 26 checks passed
@slojosic-amd deleted the disable_rocwmma_fattn branch on February 3, 2026 22:32