Enable CK for MIOpen support on gfx110x, gfx120x, and gfx115x #2755
Open
BradPepersAMD wants to merge 45 commits into main
## Motivation

Avoid [timeout failures in CI](https://github.com/ROCm/TheRock/actions/runs/20371240849/job/58539352231) nightly.

## Technical Details

Reduce the ctest `--parallel` option from 8 to 4 on gfx1153.

## Test Plan

1. w/o retries
   1. Locally modify test_rocthrust.py to remove `--repeat until-pass:6` to avoid masking intermittent problems.
   2. Reboot the test host.
   3. Run the modified test_rocthrust.py a number of times and record its success rate.
2. w/ retries
   1. Restore test_rocthrust.py to the version in this PR.
   2. Reboot the test host.
   3. Run test_rocthrust.py a number of times and record its success rate.

## Test Result

1. test_rocthrust.py w/o retries: passed 5/10 times, but none of the failures was a timeout (they were exceptions/segfaults)
2. test_rocthrust.py w/ retries: passed 10/10 times: 7/10 involved retries, but none were due to timeouts

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
…s for hipDNN customer builds (#2736) ## Motivation hipDNN is a graph-based library that allows consumers to build graphs of operation nodes that are processed and fulfilled on the GPU by engine provider plugins. The architecture is data-agnostic between the frontend and the engine plugins on the backend, which means that consumers can write new frontend nodes and new engine plugins with the hipDNN software in the middle being none the wiser. Part of this adaptability is provided by flatbuffers, which allows a data schema that allows forward compatibility as graphs are serialized, passed through hipDNN, and processed by engine plugins. Since our consuming users can write new graph nodes and new plugins, they must also be able to add new types to the flatbuffer schema and then have them processed from schema .fbs files into header files. Ideally for them, they can just point at a ROCm distribution and all the utilities are available for them to modify to their hearts content. `flatc` is one of the utilities necessary for that. ## Technical Details - Added `flatc` to the build cmake - Added `flatc` to the provided artifacts toml ## Test Plan - Verify that flatc is available in the bin folder - Verify that the appropriate flatbuffer::flatc cmake target is available - Test by using ROCm artifacts for a standalone hipDNN build ## Test Result - [x] - `flatc` binary and cmake targets available - [x] hipDNN builds in standalone (see issue [#3516](ROCm/rocm-libraries#3516)) ## Submission Checklist - [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
Create the following symlinks during amdrocm and amdrocm-core installation: /opt/rocm/core -> /opt/rocm/core-<major>.<minor> /opt/rocm/core-<major> -> /opt/rocm/core-<major>.<minor> Also updated a comment in package.json Co-authored-by: raramakr <[email protected]>
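The symlink layout described in this commit could be sketched roughly as follows. This is an illustration only: the helper name `create_core_symlinks` and its parameters are hypothetical, and the real packages may well create the links another way (for example from post-install scripts).

```python
import os
from pathlib import Path


def create_core_symlinks(rocm_root: str, major: int, minor: int) -> list[Path]:
    """Create <root>/core and <root>/core-<major>, both pointing at
    <root>/core-<major>.<minor>.

    Hypothetical helper for illustration; not taken from the packaging scripts.
    """
    root = Path(rocm_root)
    target = root / f"core-{major}.{minor}"
    links = [root / "core", root / f"core-{major}"]
    for link in links:
        # Replace a stale link from a previous install. A real directory at
        # this path is left alone, and os.symlink then raises FileExistsError.
        if link.is_symlink():
            link.unlink()
        os.symlink(target, link, target_is_directory=True)
    return links
```

Calling `create_core_symlinks("/opt/rocm", 7, 11)` would, under these assumptions, produce `/opt/rocm/core` and `/opt/rocm/core-7`, both resolving to `/opt/rocm/core-7.11`.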
…annel docs (#2669) ## Motivation Follow-up to #2392, documenting local version identifiers used for pytorch when building against dev ROCm releases. I also took the opportunity to clarify that each release distribution channel (index page on a hosted domain) contains only release artifacts of the matching release type, except for "dev releases". This detail is important because one can build a release of any type, and then attempt to upload to any release distribution channel (with sufficient permissions). ## Submission Checklist - [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
## Motivation

While pre-commit should not be needed for building ROCm, it is needed for productive development. Projects like hipDNN have also started logging warnings when it is not installed (see https://github.com/ROCm/rocm-libraries/blob/develop/projects/hipdnn/cmake/CheckPreCommit.cmake)

## Technical Details

The minimum version here is not load bearing... I just chose the version I have installed on my own machine.

## Test Plan / Test Result

Local testing:

```
λ .\.venv\Scripts\activate.bat
λ pip install -r requirements.txt
λ pip freeze | grep pre_commit
pre_commit==4.5.1
λ which pre-commit
/d/projects/TheRock/.venv/Scripts/pre-commit
λ pre-commit run --all-files
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check for merge conflicts................................................Passed
Check for added large files..............................................Passed
Mixed line ending........................................................Passed
black....................................................................Passed
clang-format.............................................................Passed
mdformat.................................................................Passed
No-tabs checker..........................................................Passed
Lint GitHub Actions workflow files.......................................Passed
```

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
Should fix failing torchaudio builds.
## Motivation

To get the [disabled-test issues](https://github.com/ROCm/TheRock/issues?q=is%3Aissue%20state%3Aopen%20label%3Adisabled-test) down to 0, we are:

- including the `rocprim` flaky tests that have been fixed
- extending the `rocroller` timeout to the maximum of 300 mins (a general rule for TheRock, until the rocroller team resolves SWDEV-574223 and we can get more appropriate test times)
- ignoring the consistently failing hipdnn test for gfx950

## Technical Details

Updates `fetch_test_configurations.py` to extend the timeout, adds tests to ignore for gfx950-dcgpu hipdnn, and updates `test_rocprim.py`.

## Test Plan

Testing done in CI workflow dispatch and CI in PR

## Test Result

rocroller passes here: [gfx950](https://github.com/ROCm/TheRock/actions/runs/20724154274/job/59519387104?pr=2768), [gfx94X](https://github.com/ROCm/TheRock/actions/runs/20724154274/job/59519386862?pr=2768)

rocprim passes here: [gfx950](https://github.com/ROCm/TheRock/actions/runs/20724154274/job/59519387099?pr=2768), [gfx94X](https://github.com/ROCm/TheRock/actions/runs/20724154274/job/59519386885?pr=2768), [gfx1151](https://github.com/ROCm/TheRock/actions/runs/20724154274/job/59519386829?pr=2768)

hipdnn passes here: [gfx950](https://github.com/ROCm/TheRock/actions/runs/20724154274/job/59519386900?pr=2768), [gfx1151](https://github.com/ROCm/TheRock/actions/runs/20731732254/job/59521126305)

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Closes #2727
Closes #1724
Progress on #2758
Reverts #2730 The IAM role name has been changed and this is no longer required
…2774) Extend the artifact directory generation logic to support gfxarch values defined at the artifact level. Previously, directory suffix selection relied solely on the package-level gfxarch setting, which prevented artifacts from specifying their own architecture requirements. With this update, each artifact can optionally define an Artifact_Gfxarch field. When present, this value is used to determine the directory suffix for that specific artifact. If the field is missing, the logic falls back to the package-level gfxarch setting to ensure consistent behavior. This change improves flexibility for mixed-architecture packages and ensures that artifact-specific overrides are respected without altering the behavior of packages that do not define per-artifact gfxarch data. Co-authored-by: raramakr <[email protected]>
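The fallback rule described in this commit can be sketched in a few lines of Python. Only the `Artifact_Gfxarch` field name comes from the commit message; the function name `select_gfxarch`, the package-level key `Gfxarch`, and the example values are assumptions made for illustration, not the packaging code itself.

```python
from typing import Mapping, Optional


def select_gfxarch(package: Mapping[str, str],
                   artifact: Mapping[str, str]) -> Optional[str]:
    """Pick the gfxarch value that decides an artifact's directory suffix."""
    # An artifact-level override wins whenever the field is present.
    if "Artifact_Gfxarch" in artifact:
        return artifact["Artifact_Gfxarch"]
    # Otherwise fall back to the package-level setting.
    # "Gfxarch" is an assumed key name for that setting.
    return package.get("Gfxarch")
```

Because the lookup falls through to the package-level value, packages that never define per-artifact data behave the same as before, which matches the compatibility claim in the description.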
Provides quick-start guidance for developers using Claude Code with TheRock: - Quick setup and build commands - Development workflows (component targets, incremental builds) - Build directory layout and common patterns - Code quality (pre-commit) and style guidelines - Git workflow conventions - Project structure overview - Links to detailed documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude <[email protected]>
- Updated the supported Python versions in RELEASES.md and external-builds/pytorch/README.md to include Python 3.10. - Modified GitHub workflows for portable Linux and Windows PyTorch wheels to reflect the updated Python version matrix. This change ensures compatibility with Python 3.10 across relevant documentation and CI configurations.
…#2781) Revert to allow proper sequencing of Python 3.10 changes
Bumping rocm-systems submodule 2026-01-05
- Needed to make compiler/amd-staging branch of TheRock buildable with ToT llvm-project.
## Motivation From #2789, we are disabling amdsmi but still getting "sanity check" signal from gfx1151 linux machines ## Technical Details Disabling amdsmi for gfx1151 and re-enabling gfx1151 Linux test machines ## Test Plan Tested via workflow_dispatch since this is tested via test machine, CI is skipped ## Test Result Test works here: https://github.com/ROCm/TheRock/actions/runs/20763397792/job/59624098173 ## Submission Checklist - [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests. Closes #2741 Progress on #2789 Next will be enabling gfx1151 machines for all tests
## Motivation

Per #2614, the gfx1150 machines were disabled. However, `linux-strix-halo-gpu-rocm-1` is nowhere to be found, so it cannot be debugged or fixed. We do have 2 healthy gfx1150 machines, so we can re-enable this runner.

## Technical Details

Re-enabling gfx1150 Linux test machines

## Test Plan

Testing via CI workflow_dispatch. Since this is tested via test machine, CI is skipped

## Test Result

Test works here: https://github.com/ROCm/TheRock/actions/runs/20760306307/job/59615409472

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Closes #2614
This PR enables benchmark testing on Windows runners for `gfx1151`, extending the existing Linux-only benchmark infrastructure to support cross-platform execution. ## Motivation Currently, TheRock benchmark tests only run on Linux runners. To ensure comprehensive performance validation across all supported platforms, we need to enable benchmark testing on Windows as well. This PR extends the benchmark framework to support Windows execution, specifically targeting `gfx1151 `Windows runners. ## Technical Details 1. Workflow Integration - Added `test_windows_benchmarks` job to `.github/workflows/ci_windows.yml` that mirrors the Linux benchmark implementation 2. Cross-Platform CPU Detection - Refactored hardware.py to support both Linux and Windows 3. Windows Encoding Compatibility - Replaced all Unicode characters with ASCII-safe alternatives to fix `UnicodeEncodeError` on Windows (CP1252 encoding) 4. Binary Path Simplification - Removed explicit binary existence check from `test_rocrand_benchmark.py` 5. GPU Family Configuration - Added Windows benchmark runner for `gfx1151` in `amdgpu_family_matrix.py` ## Test Plan - Nightly CI workflow triggers test_windows_benchmarks job for `gfx1151` - Build artifacts are correctly passed to benchmark job - Benchmark job runs on correct Windows GPU runner - All benchmark scripts execute successfully (rocFFT, rocRAND, rocSOLVER, hipBLASLt) - Results are saved locally and uploaded to benchmark database ## Test Result Windows Build and Benchmark Execution verified on Nightly-CI (`gfx1151`). [https://github.com/ROCm/TheRock/actions/runs/20466255467](https://github.com/ROCm/TheRock/actions/runs/20466255467) --------- Signed-off-by: Lenine Ajagappane <[email protected]>
Re-enabling hipdnn tests for gfx950 [Working in workflow_dispatch test](https://github.com/ROCm/TheRock/actions/runs/20763261063/job/59628678556) Closes #2758 Adding `skip-ci` label as tests are proven to work on workflow_dispatch, no need to take resources for this update
Fix RDC runtime library lookup failures and use explicit RPATH settings - therock_subproject.cmake: * Add explicit INSTALL_RPATH_EXECUTABLE_DIR/LIBRARY_DIR for RDC since its INSTALL_DESTINATION is flattened by artifact-flatten * RPATH is now relative to final flattened location (bin/, lib/) - libcap: * Remove redundant SONAME handling in patch_install.sh; patch_linux_so.py already manages SONAME correctly Fixes: ROCM-393
## Motivation From #2683, we now get good signal for gfx120X via [workflow dispatch testing](https://github.com/ROCm/TheRock/actions/runs/20761843018/job/59618697480) ## Technical Details Re-enabling test machine and also providing ROCm recommended docker flags ## Test Plan CI ## Test Result Test works here via workflow dispatch: https://github.com/ROCm/TheRock/actions/runs/20761843018/job/59618697480 Also running tests via CI ## Submission Checklist - [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests. Closes #2683
Aotriton is already disabled on linux for those archs, see #2709. Re-enabling can happen when PyTorch has bumped aotriton to include the following PR: ROCm/aotriton#142
Re-enabling tests for `gfx1151` windows [Tests work via `workflow_dispatch`](https://github.com/ROCm/TheRock/actions/runs/20786377310/job/59697319709), no need to re-build everything (thus the `skip-ci` flag) Closes #2798
…2796) This PR is comprised of 3 commits. Please note mpfr is already a PR itself here (#2602), but expat and ncurses depend on the mpfr commit to apply cleanly. So it is included here. This PR gives us a chance to run CI for all of these dependencies together and provides a way to review the expat and ncurses changes as well. If all of these look good together, we could potentially push them as a single merge. [Add MPFR to common sysdeps](3142f10) MPFR is a required dependency for GMP and ROCgdb, so include it along with the other sysdeps. The sources are cached from the official website. We use the 4.2.2 release. We do some light smoke testing to make sure the archive and libraries get built. MPFR is OS-agnostic, so is kept under third-party/sysdeps/common. In the future it will be used for Windows-based builds of the debug-tools. [Add expat to common sysdeps](32e5fc7) Expat is a required dependency for ROCgdb, so include it along with the other sysdeps. The sources are cached from the official website. We use the 2.7.3 release. We do some light smoke testing to make sure the archive and libraries get built. Expat is OS-agnostic, so is kept under third-party/sysdeps/common. In the future it will be used for Windows-based builds of the debug-tools. [Add ncurses to common sysdeps](4e789db) NCurses is a required dependency for ROCgdb, so include it along with the other sysdeps. The sources are cached from the official website. We use the 6.5 release. We do some light smoke testing to make sure the archive and libraries get built. NCurses is OS-agnostic, so is kept under third-party/sysdeps/common. In the future it will be used for Windows-based builds of the debug-tools. --------- Co-authored-by: Stella Laurenzo <[email protected]> Co-authored-by: Claude <[email protected]>
…les (#2776) ## Motivation The process for updating dockerfiles has been tribal knowledge until now, with a few sharp edges: * Each dockerfile has specific constraints that have only been loosely documented * The automated publishing process through github actions requires a sequence of workflow runs and pull requests to land changes * Downstream users have been looking at our published packages and using them in their own workflows (automated or otherwise), without clear guidance on how the packages are intended to be used / supported ## Technical Details I omitted details on how to build and test images locally as I'm currently developing on Windows which does not have good Docker support. I wrote these docs before, which we could adapt here: https://github.com/iree-org/base-docker-images?tab=readme-ov-file#building-locally. Someone with more recent Linux development expertise could help write that part of the docs. ## Submission Checklist - [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
## Motivation

Since #2419 (merged 1 month ago), we've had zero patches to PyTorch repositories here in TheRock, so this code has been unused. We now use the release branches in the https://github.com/ROCm/pytorch repository exclusively, and landing cherry-picks for issues like ROCm/pytorch#2889 has been proceeding smoothly.

I think it's time to shut the door on .patch files for PyTorch, then we can eventually do the same thing for ROCm in general once the https://github.com/ROCm/TheRock/tree/main/patches folder is similarly empty.

### Discussion of counter-arguments

While there is the https://github.com/ROCm/pytorch repository, there are not equivalents for other projects and this patch structure allows us to change any submodule of pytorch too. We really need to lean on upstream, or at least _collaborative_ downstream development here though, using regular git features and not patch files managed through bespoke scripts separate from the project code.

## Technical Details

The checkout scripts are still useful since they handle reading from the "related commits" file to select a commit to checkout and they also run HIPIFY. The checkout process is:

1. Determine which commit to checkout (e.g. by reading the related commits file from the torch repo)
2. Checkout that commit and tag it
3. **(⚠️ removed here)** Apply "base" patches
4. Run HIPIFY and commit + create a tag
5. **(⚠️ removed here)** Apply "hipify" patches

## Test Plan

* Sanity checked checkout scripts locally
* Test workflow runs:

Workflow | Test details | Run logs
-- | -- | --
`build_portable_linux_pytorch_wheels.yml` | Expected to checkout `release/2.9` from ROCm/pytorch then build successfully | https://github.com/ROCm/TheRock/actions/runs/20761710618/job/59618093185
`build_windows_pytorch_wheels.yml` | Expected to checkout `release/2.9` from ROCm/pytorch then build successfully | https://github.com/ROCm/TheRock/actions/runs/20761686755/job/59618014098
`test_pytorch_wheels.yml` | Showing that the error in the Windows run above is fixed by removing the `--no-patch` argument | https://github.com/ROCm/TheRock/actions/runs/20763733809/job/59624927054
`release_windows_pytorch_wheels.yml` | Expected to populate a matrix with the expected Python and PyTorch versions (cancelled before actually running those builds) | https://github.com/ROCm/TheRock/actions/runs/20761830824
`release_portable_linux_pytorch_wheels.yml` | Untested (similar enough to the Windows workflow) | N/A

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

---------

Co-authored-by: Claude Sonnet 4.5 <[email protected]>
* texinfo-tex (discovered missing while working) * Shared builds of python 3.10-14 so that rocgdb can link against versions suitable for embedding python. Note that we build these python variants from source sufficient for build time use: they do not have all features. Adds an estimated ~250MiB to the container size and is unavoidable. Container build on a regular workstation doesn't take much more time. On the default GHA runner, it adds about 10m to the build time (12m total), which is not consequential. Will unblock #2797 once we roll the hashes to this new version.
Avoids error `ERROR: unexpected error - cannot import name '_DIR_MARK' from 'pathspec.patterns.gitwildmatch'` which comes from dvc depending on a private API that changed.
## Motivation Add answers for common questions that ROCm users may have, including a detailed section specific to gfx1151 (Strix Halo) ## Submission Checklist - [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests. --------- Co-authored-by: Jan Stephan <[email protected]> Co-authored-by: Adel Johar <[email protected]>
…4) (#2699)

## Overview

This PR updates the following compiler submodules to the Nov-24 staging branch:

- `amd-llvm`
- `spirv-llvm-translator`
- `hipify`

This update includes required build system changes to support new LLVM tooling.

## Submodule Updates

| Submodule | Commit Hash |
|-----------|-------------|
| amd-llvm | `08a72fc` |
| hipify | `d59429d` |
| spirv-llvm-translator | `0eaeb11` |

**Commit Range**: [ROCm/llvm-project@8e85e31...08a72fc](ROCm/llvm-project@8e85e31...08a72fc)

**Time Period**: October 06, 2025 → November 24, 2025

## LLVM Version Information

- **Base Version**: clang version 22.0.0
- **Major Version Change**: No

## Test Plan

- Full TheRock build (Linux & Windows)
- Pytorch build
- QA nightly validation with ROCm components

## Test Results

- Full Linux build - https://github.com/ROCm/TheRock/actions/runs/20477434962 - SUCCESS
- Full Windows build - https://github.com/ROCm/TheRock/actions/runs/20477447619 - SUCCESS
- Pytorch Linux - https://github.com/ROCm/TheRock/actions/workflows/release_portable_linux_pytorch_wheels.yml?query=branch%3Ausers%2Flajagapp%2Fcompiler-nov24-promo - Build SUCCESS. The rocm-sdk sanity test failed, which is a known issue (#2714) with the older TheRock commit. Restarted the build after rebasing the PR.

Signed-off-by: Lenine Ajagappane <[email protected]>
## Motivation With the Nov24 compiler promotion, OpenMP offload code compilation fails in rocm-example with the following error: ``` clang++: error: unable to execute command: posix_spawn failed: No such file or directory clang++: error: llvm-offload-binary command failed with exit code 1 ``` The root cause is that the llvm-offload-binary tool was not being built because it wasn't included in the required LLVM tools list. This tool is essential for GPU offloading as it bundles device object files into a single binary container. This PR adds LLVM_OFFLOAD_BINARY to the required tools list to ensure it's built and available for OpenMP offload compilation. ## Technical Details **Background**: The llvm-offload-binary tool was introduced in llvm-project [PR#161438](llvm/llvm-project#161438) and is responsible for bundling device object files into a single binary container for GPU offloading workflows. **Related Commit**: Part of the Nov24 compiler promotion effort alongside the LLVM/SPIRV/Hipify submodule updates. We can merge this PR once #2699 merged. ## Test Result - Full TheRock Linux build - https://github.com/ROCm/TheRock/actions/runs/20741883335 - llvm-offload-binary tool successfully built - Verified rocm-example compilation, OpenMP offload compilation errors resolved Signed-off-by: Lenine Ajagappane <[email protected]>
From command: ``` sed -i 's/583d473f263a289222c48d4b493e2956b2354a45796f09dee6f2c8ecd4504ab6/6e8242d347af7e0c43c82d5031a3ac67b669f24898ea8dc2f1d5b7e4798b66bd/g' $(git grep -l 583d473f263a289222c48d4b493e2956b2354a45796f09dee6f2c8ecd4504ab6) ``` Includes a fix for pinning a dvc dep from d04e431 See producing PR #2815
Bump rocm-libraries and composable-kernel 20260107. MIOpen now uses tip of composable kernel.
Adding sharding for `rocprim` after noticing very frequent timeouts Tests working here: [shard 1](https://github.com/ROCm/TheRock/actions/runs/20825758128/job/59826923867), [shard 2](https://github.com/ROCm/TheRock/actions/runs/20825758128/job/59826923797)
Add multi-distro rocm-runtime Dockerfile and installer scripts Introduces rocm_runtime.Dockerfile supporting Ubuntu, AlmaLinux, and Azure Linux via configurable BASE_IMAGE. Includes install_rocm_tarball.sh for downloading ROCm tarballs (nightlies/prereleases/devreleases/stable) and install_rocm_deps.sh for auto-detecting and installing dependencies. - Support for stable releases via repo.amd.com - Standalone usage instructions for environment variables Signed-off-by: Wang, Yanyao <[email protected]> Co-authored-by: Wang, Yanyao <[email protected]>
…TheRock (#2797) Integrate 3 debugging tools into TheRock under the debug-tools logical group: - amd-dbgapi - rocr-debug-agent - rocgdb Building of those tools is controlled by the -DTHEROCK_ENABLE_DEBUG_TOOLS switch. The individual tools can also be included/excluded. At the moment rocgdb builds without python support (rocgdb-pynone). This commit also enables TheRock to detect Python shared library support (libpython) and adds hardlink support to the artifact populator. Co-authored-by: Stella Laurenzo <[email protected]> Co-authored-by: Claude <[email protected]>
When building therock-mpfr, the build fails with:

[therock-mpfr] aclocal-1.17: command not found

when automake 1.17 is not installed. The `Setup - Ubuntu (24.04)` section of our README.md does not require installing it.

This occurs because `cmake -E copy_directory` does not preserve file timestamps. When the MPFR source is copied to the build directory, autoconf-generated files (aclocal.m4, Makefile.in) end up with newer timestamps than their source files (configure.ac, etc.), causing the autotools build system to attempt regeneration. This regeneration requires the exact automake version (1.17) that was used to generate the original files.

Fix: Touch the critical autoconf-generated files after copying to ensure their timestamps are newer than the source files. This prevents the autotools build system from trying to regenerate them.

This approach matches the existing workaround used in the GMP build (third-party/sysdeps/common/gmp/CMakeLists.txt).

## Test Plan

No aclocal-1.17 installed locally

```
rm -rf build/third-party/sysdeps/linux/mpfr
ninja -C build therock-mpfr
```

## Test Result

Successful

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
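The "touch after copying" idea from this commit can be illustrated with a small Python sketch. It is not the actual fix, which per the description lives in the CMake build. The helper name `touch_generated_files` is hypothetical, and the default file list is an assumption: `aclocal.m4` and `Makefile.in` are named in the commit message, while `configure` is added here as a guess.

```python
import os
import time
from pathlib import Path

# aclocal.m4 and Makefile.in are named in the commit message;
# configure is an assumed addition for this sketch.
GENERATED_FILES = ("aclocal.m4", "configure", "Makefile.in")


def touch_generated_files(source_dir: str, names=GENERATED_FILES) -> list[Path]:
    """Set the mtime of autotools-generated files to "now".

    After this, the generated files are at least as new as configure.ac
    and friends, so make sees no reason to rerun aclocal/automake.
    """
    now = time.time()
    touched = []
    for name in names:
        path = Path(source_dir) / name
        if path.exists():
            # Same timestamp for every file keeps their relative order stable.
            os.utime(path, (now, now))
            touched.append(path)
    return touched
```

The sketch only touches files at the top level of the copied tree; a real source tree with nested `Makefile.in` files would need those handled as well.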
…mmends instead of requires (#2800)

Fixes the libstdc++ error on SLES by removing the libstdc++ dependency from the rpm packages. Also adds the hipify package to amdrocm-core.

## Motivation

SLES package installation fails with the error:

Problem: 1: nothing provides 'libstdc++' needed by the to be installed amdrocm-llvm7.11

## Technical Details

Removes libstdc++ as a mandatory requirement.
## Motivation

For some time, ASAN builds have been failing due to a variety of `build` and `Test Packaging` errors. With this set of fixes and conditionals, the CI ASAN pipeline now passes.

Closes #1990
Related to #2632

## Technical Details

- After a RCCL bump, the run time extends up to 15 hours based on tests, so the timeout is extended
- Most of the third-party libraries did not support ASAN, causing test failures
- rocprofiler-systems doesn't support clang, so a conditional is added
- rocprofiler-sdk requires amd-hip as compiler
- fftw3 requires a new cmake arg

## Test Plan

Testing via CI ASAN

## Test Result

Build passes here: https://github.com/ROCm/TheRock/actions/runs/20722419413/job/59489435646

Build without RCCL running here: https://github.com/ROCm/TheRock/actions/runs/20827525680/job/59832753950

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
7e9ba03 to
3077ce8

Motivation
CK should support RDNA 3 and RDNA 4, and RDNA 3.5 should be able to reuse the RDNA 3 support. This PR enables CK on all of these ASICs so we can find out whether any issues remain.
The one known issue is what appears to be an LLVM compiler problem with Windows gfx110x support, so we expect that configuration to still fail for now.
Technical Details
For now, this just adds the gfx IDs that CK should potentially work with.
Test Plan
We'll need to generate builds for all of these ASICs, show that CK for MIOpen compiles on both Linux and Windows, and then test performance and results across a wide range of shapes on each ASIC.
Test Result
Submission Checklist