@christiangnrd (Member) commented Jun 27, 2025

Not sure how the tests didn't catch this. `num_chunks` was computed under the assumption that each thread processes 1 element instead of 2. Combined with a typo, this caused out-of-bounds accesses, which showed up as errors on backends with working error reporting.
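
To illustrate the kind of mismatch described above, here is a minimal sketch of the chunk-count arithmetic. It is not the package's actual code: the helper names (`cdiv`, `num_chunks`, `out_of_bounds_chunks`) and the sizes used are hypothetical, and the real kernel may compute and guard its indices differently.

```python
def cdiv(a: int, b: int) -> int:
    """Ceiling division for positive integers."""
    return (a + b - 1) // b


def num_chunks(n: int, block_size: int, elems_per_thread: int) -> int:
    """Number of chunks needed when each block covers
    block_size * elems_per_thread elements (hypothetical helper)."""
    return cdiv(n, block_size * elems_per_thread)


def out_of_bounds_chunks(n: int, block_size: int, elems_per_thread: int,
                         chunks: int) -> list[int]:
    """Chunk indices whose first element already lies past the end of the
    input, i.e. chunks that would read out of bounds if left unguarded."""
    span = block_size * elems_per_thread
    return [c for c in range(chunks) if c * span >= n]


n, block_size = 1000, 256

# Chunk count that matches a kernel where each thread handles 2 elements.
correct = num_chunks(n, block_size, elems_per_thread=2)

# Chunk count computed as if each thread handled only 1 element.
assumed = num_chunks(n, block_size, elems_per_thread=1)

print(correct, assumed)
print(out_of_bounds_chunks(n, block_size, 2, assumed))
print(out_of_bounds_chunks(n, block_size, 2, correct))
```

With these illustrative sizes, counting chunks as if each thread handled one element yields twice as many chunks as the kernel actually needs, and the surplus chunks start beyond the end of the array.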

@christiangnrd changed the title from "Add accumulate tests" to "Fix for accumulate by block" Jun 27, 2025
@christiangnrd marked this pull request as ready for review June 27, 2025 17:49
@christiangnrd (Member, Author) commented

@anicusan This is probably worth a patch release, so I already bumped the version, but if you disagree I can revert it.

@giordano

I can confirm that, for me on AMD MI300, this fixes the memory issue mentioned at #46 (comment).

@vchuravy merged commit 06c2594 into JuliaGPU:main Jul 1, 2025
38 checks passed
@christiangnrd deleted the err branch July 1, 2025 14:53