Draft
2 changes: 1 addition & 1 deletion rocm-libraries
Submodule rocm-libraries updated 70 files
+1 −0 .github/workflows/therock-ci-linux.yml
+1 −0 .github/workflows/therock-ci-windows.yml
+1 −0 projects/composablekernel/CHANGELOG.md
+29 −12 projects/composablekernel/example/ck_tile/01_fmha/codegen/ops/fmha_batch_prefill.py
+1 −1 projects/composablekernel/example/ck_tile/01_fmha/codegen/ops/fmha_fwd.py
+9 −0 projects/composablekernel/example/ck_tile/01_fmha/fmha_fwd.hpp
+117 −70 projects/composablekernel/include/ck_tile/ops/fmha/kernel/fmha_batch_prefill_kernel.hpp
+18 −7 projects/composablekernel/include/ck_tile/ops/fmha/kernel/fmha_fwd_kernel.hpp
+45 −11 projects/composablekernel/include/ck_tile/ops/gemm/kernel/universal_gemm_kernel.hpp
+9 −0 ...posablekernel/include/ck_tile/ops/grouped_convolution/kernel/grouped_convolution_backward_weight_kernel.hpp
+26 −2 projects/composablekernel/test/ck_tile/grouped_conv/test_ck_tile_grouped_conv_bwd_weight.cpp
+6,584 −16,537 ...blaslt/src/Tensile/Logic/asm_full/gfx950/Equality/gfx950_Cijk_Alik_Bljk_BBS_BH_BiasSB_HAS_SAV_UserArgs.yaml
+5,884 −1,438 ...t/src/Tensile/Logic/asm_full/gfx950/Equality/gfx950_Cijk_Alik_Bljk_F8BS_BH_BiasSB_HAS_SAB_SAV_UserArgs.yaml
+26 −10 projects/hipdnn/tests/backend/IntegrationBackendDescriptor.cpp
+37 −6 projects/hipdnn/tests/backend/IntegrationHandleApi.cpp
+3 −9 projects/miopen/docs/install/build-source.rst
+4 −11 projects/miopen/src/db.cpp
+6 −178 projects/miopen/src/include/miopen/lock_file.hpp
+4 −6 projects/miopen/src/lock_file.cpp
+20 −12 projects/miopen/src/ramdb.cpp
+3 −3 projects/miopen/test/gtest/kernel_tuning_net.cpp
+0 −119 projects/miopen/test/gtest/unit_lock_file.cpp
+6 −1 projects/rocblas/CHANGELOG.md
+2 −2 projects/rocblas/clients/CMakeLists.txt
+0 −51 projects/rocblas/clients/cmake/FindROCmSMI.cmake
+1 −0 projects/rocblas/clients/common/CMakeLists.txt
+33 −0 projects/rocblas/clients/common/blas_ex/common_herk_ex.cpp
+43 −0 projects/rocblas/clients/common/blas_ex/common_herk_ex.hpp
+50 −0 projects/rocblas/clients/common/cblas_interface.cpp
+1 −0 projects/rocblas/clients/gtest/CMakeLists.txt
+1 −0 projects/rocblas/clients/gtest/blas3_gtest.yaml
+199 −0 projects/rocblas/clients/gtest/blas_ex/herk_ex_gtest.cpp
+1 −1 projects/rocblas/clients/gtest/get_solutions_gtest.yaml
+151 −0 projects/rocblas/clients/gtest/herk_ex_gtest.yaml
+517 −0 projects/rocblas/clients/include/blas_ex/testing_herk_ex.hpp
+13 −0 projects/rocblas/clients/include/cblas_interface.hpp
+9 −0 projects/rocblas/clients/include/rocblas_common.yaml
+29 −0 projects/rocblas/clients/include/rocblas_fortran.f90
+16 −0 projects/rocblas/clients/include/rocblas_fortran.h.in
+2 −0 projects/rocblas/clients/include/rocblas_no_fortran.hpp
+21 −0 projects/rocblas/clients/include/rocblas_smoke.yaml
+30 −0 projects/rocblas/clients/include/type_dispatch.hpp
+5 −0 projects/rocblas/docs/reference/extension.rst
+104 −0 projects/rocblas/library/include/internal/rocblas-functions.h
+27 −0 projects/rocblas/library/include/rocblas_module.f90
+2 −0 projects/rocblas/library/src/CMakeLists.txt
+2 −0 projects/rocblas/library/src/blas3/rocblas_syrk_herk_kernels.cpp
+26 −0 projects/rocblas/library/src/blas_ex/rocblas_herk_ex.cpp
+112 −0 projects/rocblas/library/src/blas_ex/rocblas_herk_ex.hpp
+238 −0 projects/rocblas/library/src/blas_ex/rocblas_herk_ex_imp.hpp
+563 −0 projects/rocblas/library/src/blas_ex/rocblas_herk_ex_kernels.cpp
+25 −400 projects/rocblas/library/src/blas_ex/rocblas_syrk_ex_kernels.cpp
+399 −0 projects/rocblas/library/src/blas_ex/rocblas_syrk_herk_ex_kernels.hpp
+0 −1 projects/rocrand/test/internal/test_rocrand_discrete.cpp
+27 −29 projects/rocrand/test/test_common.hpp
+0 −12 shared/rocroller/lib/include/rocRoller/CodeGen/LoadStoreTileGenerator.hpp
+1 −0 shared/rocroller/lib/include/rocRoller/KernelGraph/ControlGraph/ControlFlowRWTracer.hpp
+1 −1 shared/rocroller/lib/include/rocRoller/KernelGraph/Transforms/AssignComputeIndex.hpp
+21 −142 shared/rocroller/lib/source/CodeGen/LoadStoreTileGenerator.cpp
+6 −3 shared/rocroller/lib/source/CodeGen/LowerFromKernelGraph.cpp
+16 −83 shared/rocroller/lib/source/KernelGraph/ControlFlowArgumentTracer.cpp
+13 −0 shared/rocroller/lib/source/KernelGraph/ControlFlowRWTracer.cpp
+9 −0 shared/rocroller/lib/source/KernelGraph/Transformations/AddStreamK.cpp
+80 −6 shared/rocroller/lib/source/KernelGraph/Transformations/AssignComputeIndex.cpp
+4 −8 shared/rocroller/lib/source/KernelGraph/Transformations/CleanArguments.cpp
+1 −18 shared/rocroller/lib/source/KernelGraph/Transformations/InlineIncrements.cpp
+3 −33 shared/rocroller/test/catch/AssignComputeIndexTest.cpp
+2 −2 shared/rocroller/test/unit/KernelGraphTest.cpp
+15 −6 shared/rocroller/test/unit/LDSTileCopyTest.cpp
+16 −6 shared/rocroller/test/unit/PermLanesTest.cpp