ROCm 6.3 support #26934

I recently tested Chapel's AMD GPU support with ROCm 6.3 (not yet officially supported by Chapel). This issue captures the successes and failures I had with it.

By editing `util/chplenv/chpl_gpu.py`, I could build Chapel and run `make check` with ROCm 6.3.0.

<details>

<summary> patch file </summary>

```diff
diff --git a/util/chplenv/chpl_gpu.py b/util/chplenv/chpl_gpu.py
index 41890d8c469..132ef95c1dc 100644
--- a/util/chplenv/chpl_gpu.py
+++ b/util/chplenv/chpl_gpu.py
@@ -541,7 +541,7 @@ def _validate_rocm_version_impl():
     MIN_REQ_VERSION = "5.0"
     MAX_REQ_VERSION = "5.5"
     MIN_ROCM6_REQ_VERSION = "6"
-    MAX_ROCM6_REQ_VERSION = "6.3"
+    MAX_ROCM6_REQ_VERSION = "6.4"
 
     rocm_version = get_sdk_version()
 
```

</details>
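
For readers wondering why the bound has to move to `"6.4"` to admit 6.3.0, here is a minimal sketch of a half-open version check. It assumes the maximum is treated as exclusive, which is what the patch suggests but which I have not confirmed against the real `_validate_rocm_version_impl`. The helpers `parse_version` and `in_supported_range` are hypothetical names for illustration, not functions from `chpl_gpu.py`.

```python
def parse_version(text):
    """Turn a dotted version string such as "6.3.0" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_supported_range(version, min_version, max_version):
    """Return True when min_version <= version < max_version.

    Assumption: the upper bound is exclusive. Tuples compare element by
    element, and a shorter tuple that is a prefix of a longer one sorts
    first, so (6, 3) < (6, 3, 0) < (6, 4).
    """
    v = parse_version(version)
    return parse_version(min_version) <= v < parse_version(max_version)


if __name__ == "__main__":
    # With the old bound, 6.3.0 falls outside the range; with the new one it fits.
    print(in_supported_range("6.3.0", "6", "6.3"))
    print(in_supported_range("6.3.0", "6", "6.4"))
```

Under that assumption, later 6.3.x patch releases would also pass the check with the `"6.4"` bound, while 6.4.0 would still be rejected.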

As a spot check, I ran the following tests. These were selected because I have seen them fail when upgrading ROCm versions without additional effort:
- test/gpu/native/jacobi/jacobi.chpl
- test/gpu/native/reduction/basic.chpl
- test/gpu/native/mathOps.chpl
- test/gpu/native/gpuWritelnAndAssertOnGpu.chpl

The only one that failed was `gpuWritelnAndAssertOnGpu`, which segfaulted. It is a usual suspect for ROCm failures: the test relies on interop and printf/varargs, which frequently trigger edge cases. Note that this is one of the tests that led us to rely on an AMD LLVM for several ROCm versions in the 5.x era.

The caveat is that I tested only with ROCm 6.3.0 and do not have access to other versions right now (the latest at the time of writing is 6.3.3).


