@rkamd
Contributor

@rkamd rkamd commented Aug 29, 2024

Resolves SWDEV-481179

Add a note in rocBLAS documentation about thread oversubscription in AOCL

References:
flame/blis#588
flame/blis#604
flame/blis#630

* Add a note about thread oversubscription in AOCL

* Apply suggestions from code review

Co-authored-by: Jeffrey Novotny <[email protected]>
@rkamd rkamd added the ci:docs-only label (Skip most non-doc CI for this PR) Aug 29, 2024
Contributor

@TorreZuk TorreZuk left a comment

Minor clarification requested

These three clients can be built by following the instructions in the Building and Installing section of the User Guide. After building the rocBLAS clients, they can be found in the directory ``rocBLAS/build/release/clients/staging``.

.. note::
   The ``rocblas-bench`` and ``rocblas-test`` executables use AMD's ILP64 version of AOCL-BLAS 4.2 as the host reference BLAS to verify correctness. However, there is a known issue with AOCL-BLAS that can cause these executables to hang. This problem can arise because the AOCL-BLAS library launches multiple threads to perform computations. If the number of threads matches the total number of CPU threads, it can lead to thread oversubscription, causing the program to hang.
Contributor

Sorry, I missed this earlier: "total number of CPU threads" -> "total number of CPU logical cores"
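
For readers following the note under review, here is a minimal sketch of one way to keep the host BLAS thread count below the number of logical cores. It is not part of this PR. The helper names `capped_thread_count` and `build_env` are hypothetical, and the sketch assumes the AOCL-BLAS build used by the clients honors the standard OpenMP `OMP_NUM_THREADS` environment variable, which is not confirmed in this thread.

```python
import os


def capped_thread_count(logical_cores, reserve=1):
    """Return a thread count below logical_cores when possible, never less than 1.

    logical_cores is what os.cpu_count() reports (it may be None).
    reserve is how many logical cores to leave free (hypothetical knob).
    """
    if logical_cores is None or logical_cores < 1:
        return 1
    return max(1, logical_cores - reserve)


def build_env(base_env, threads):
    """Copy base_env and set OMP_NUM_THREADS (assumed to be honored by AOCL-BLAS)."""
    env = dict(base_env)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


# Compute an environment for launching a client with a capped thread count.
threads = capped_thread_count(os.cpu_count())
client_env = build_env(os.environ, threads)
```

The resulting `client_env` could then be passed to `subprocess.run(..., env=client_env)` when launching `rocblas-test` or `rocblas-bench` from `rocBLAS/build/release/clients/staging`. Whether this avoids the hang depends on the assumption above.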

@rkamd rkamd merged commit d27f7bd into release-staging/rocm-rel-6.3 Aug 29, 2024