Commit 7ee0ce2

[AICOMRCCL-697] Add --enable-mpi-tests and --cmake-options to install.sh (#3862)
## Motivation

- Add --enable-mpi-tests flag (requires --debug; MPI tests reference internal RCCL symbols hidden in release builds)
- Add --cmake-options pass-through for arbitrary CMake -D options
- Update docs/install/installation.rst with new options and environment variable documentation (ONLY_FUNCS)

## JIRA ID

AICOMRCCL-697

## Test Plan

Verified that --enable-mpi-tests correctly gates on --debug and passes -DENABLE_MPI_TESTS=ON to CMake. Verified that --cmake-options appends options to the CMake invocation.

## Submission Checklist

- [ ] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
1 parent dbf1223 commit 7ee0ce2

3 files changed: +87 -8 lines changed

projects/rccl/docs/install/installation.rst

Lines changed: 38 additions & 4 deletions
@@ -52,31 +52,65 @@ The RCCL build and installation helper script options are as follows:
  .. code-block:: shell

  --address-sanitizer Build with address sanitizer enabled
+ -c|--enable-code-coverage Enable code coverage
  -d|--dependencies Install RCCL dependencies
  --debug Build debug library
+ --debug-fast Build debug library with lto optimization disabled (fast build times)
  --enable_backtrace Build with custom backtrace support
  --disable-colltrace Build without collective trace
- --disable-msccl-kernel Build without MSCCL kernels
+ --enable-msccl-kernel Build with MSCCL kernels
+ --dump-asm Disassemble code and dump assembly with inline code
  --enable-mscclpp Build with MSCCL++ support
+ --enable-mscclpp-clip Build MSCCL++ with clip wrapper on bfloat16 and half addition routines
+ --disable-roctx Build without ROCTX logging
  -f|--fast Quick-build RCCL (local gpu arch only, no backtrace, and collective trace support)
  -h|--help Prints this help message
  -i|--install Install RCCL library (see --prefix argument below)
- -j|--jobs Specify how many parallel compilation jobs to run ($nproc by default)
+ -j|--jobs Specify how many parallel compilation jobs to run (128 by default)
+ --kernel-resource-use Dump GPU kernel resource usage (e.g., VGPRs, scratch, spill) at link stage
  -l|--local_gpu_only Only compile for local GPU architecture
  --amdgpu_targets Only compile for specified GPU architecture(s). For multiple targets, separate by ';' (builds for all supported GPU architectures by default)
  --no_clean Don't delete files if they already exist
  --npkit-enable Compile with npkit enabled
+ --log-trace Build with log trace enabled (i.e. NCCL_DEBUG=TRACE)
+ --enable-mpi-tests Enable MPI-based tests (requires --debug and MPI installation; set MPI_PATH if not in /opt/ompi)
  --openmp-test-enable Enable OpenMP in rccl unit tests
- --roctx-enable Compile with roctx enabled (example usage: rocprof --roctx-trace ./rccl-program)
  -p|--package_build Build RCCL package
  --prefix Specify custom directory to install RCCL to (default: `/opt/rocm`)
- --rm-legacy-include-dir Remove legacy include dir Packaging added for file/folder reorg backward compatibility
  --run_tests_all Run all rccl unit tests (must be built already)
  -r|--run_tests_quick Run small subset of rccl unit tests (must be built already)
  --static Build RCCL as a static library instead of shared library
  -t|--tests_build Build rccl unit tests, but do not run
  --time-trace Plot the build time of RCCL (requires `ninja-build` package installed on the system)
  --verbose Show compile commands
+ --force-reduce-pipeline Force reduce_copy sw pipeline to be used for every reduce-based collectives and datatypes
+ --generate-sym-kernels Generate symmetric memory kernels
+ -q|--quiet-warnings Suppress majority of compiler warnings (not recommended)
+ --rocshmem Build with rocSHMEM support
+ --cmake-options Pass additional CMake options (e.g. --cmake-options "-DFOO=BAR -DBAZ=ON")
+
+ Available RCCL-specific CMake options for --cmake-options:
+ -DBUILD_EXT_EXAMPLES=ON Build ext-{net,tuner,profiler} example plugins (default: OFF)
+ -DENABLE_MSCCLPP_EXECUTOR=ON Enable MSCCL++ Executor (default: OFF)
+ -DENABLE_MSCCLPP_FORMAT_CHECKS=ON Enable formatting checks in MSCCL++ (default: OFF)
+ -DMSCCLPP_APPLY_PATCHES=OFF Disable source code patches for MSCCL++ (default: ON)
+ -DENABLE_IFC=ON Enable indirect function call (default: OFF)
+ -DPROFILE=ON Enable profiling (default: OFF)
+ -DTIMETRACE=ON Enable time-trace during compilation (default: OFF)
+ -DFAULT_INJECTION=OFF Disable fault injection (default: ON)
+ -DDWORDX4_INTRINSICS=OFF Disable dwordx4 intrinsics (default: ON)
+ -DENABLE_COMPRESS=OFF Disable GPU code compression (default: ON)
+ -DRCCL_ROCPROFILER_REGISTER=OFF Disable rocprofiler-register support (default: ON)
+
+ Environment variables:
+ ONLY_FUNCS Build only specified collective functions (debug builds only).
+ Restricts GPU kernel generation to the listed collectives, significantly
+ reducing build time during development. Use '|' to separate multiple functions.
+ Example: ONLY_FUNCS="AllReduce|SendRecv" ./install.sh --debug -t
+ Available: AllReduce, Broadcast, Reduce, AllGather, ReduceScatter,
+ AlltoAllPivot, SendRecv, AlltoAllGda, AlltoAllvGda
+ Advanced: Specify algo, protocol, redop, and type per collective.
+ ONLY_FUNCS="AllReduce RING SIMPLE Sum f32|SendRecv"

  .. tip::

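The ONLY_FUNCS format documented above ('|' between collectives, optional space-separated algo, protocol, redop, and type after a collective name) can be illustrated with a small sketch. The helper name `describe_only_funcs` is hypothetical and is not part of install.sh; it only shows how such a string breaks down, not how the RCCL build actually consumes it.

```shell
# Hypothetical helper (not part of install.sh): split an ONLY_FUNCS-style
# string on '|' and show the collective name and any per-collective filters.
describe_only_funcs() {
    printf '%s\n' "$1" | tr '|' '\n' | while read -r collective filters; do
        # First word is the collective; the rest (if any) is algo/protocol/redop/type.
        echo "${collective}: ${filters:-no filter}"
    done
}

describe_only_funcs "AllReduce RING SIMPLE Sum f32|SendRecv"
# prints:
#   AllReduce: RING SIMPLE Sum f32
#   SendRecv: no filter
```

Because the separator is '|', the value has to be quoted on the command line, as in the documented example, so that the shell does not interpret it as a pipe.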

projects/rccl/install.sh

Lines changed: 44 additions & 1 deletion
@@ -33,6 +33,7 @@ enable_mscclpp_clip=false
  num_parallel_jobs=$(nproc)
  npkit_enabled=false
  openmp_test_enabled=false
+ enable_mpi_tests=false
  kernel_resource_use=false
  roctx_enabled=true
  run_tests=false
@@ -43,6 +44,7 @@ generate_sym_kernels=false
  warp_speed_enabled=true # note that this flag will be overridden to false for non MI350/MI300 platforms
  quiet_warnings=false
  build_rocshmem_support=false
+ custom_cmake_options=""

  # #################################################
  # helper functions
@@ -73,6 +75,7 @@ function display_help()
  echo " --no_clean Don't delete files if they already exist"
  echo " --npkit-enable Compile with npkit enabled"
  echo " --log-trace Build with log trace enabled (i.e. NCCL_DEBUG=TRACE)"
+ echo " --enable-mpi-tests Enable MPI-based tests (requires --debug and MPI installation; set MPI_PATH if not in /opt/ompi)"
  echo " --openmp-test-enable Enable OpenMP in rccl unit tests"
  echo " -p|--package_build Build RCCL package"
  echo " --prefix Specify custom directory to install RCCL to (default: \`/opt/rocm\`)"
@@ -86,6 +89,30 @@ function display_help()
  echo " --generate-sym-kernels Generate symmetric memory kernels"
  echo " -q|--quiet-warnings Suppress majority of compiler warnings (not recommended)"
  echo " --rocshmem Build with rocSHMEM support"
+ echo " --cmake-options Pass additional CMake options (e.g. --cmake-options \"-DFOO=BAR -DBAZ=ON\")"
+ echo ""
+ echo " Available RCCL-specific CMake options for --cmake-options:"
+ echo " -DBUILD_EXT_EXAMPLES=ON Build ext-{net,tuner,profiler} example plugins (default: OFF)"
+ echo " -DENABLE_MSCCLPP_EXECUTOR=ON Enable MSCCL++ Executor (default: OFF)"
+ echo " -DENABLE_MSCCLPP_FORMAT_CHECKS=ON Enable formatting checks in MSCCL++ (default: OFF)"
+ echo " -DMSCCLPP_APPLY_PATCHES=OFF Disable source code patches for MSCCL++ (default: ON)"
+ echo " -DENABLE_IFC=ON Enable indirect function call (default: OFF)"
+ echo " -DPROFILE=ON Enable profiling (default: OFF)"
+ echo " -DTIMETRACE=ON Enable time-trace during compilation (default: OFF)"
+ echo " -DFAULT_INJECTION=OFF Disable fault injection (default: ON)"
+ echo " -DDWORDX4_INTRINSICS=OFF Disable dwordx4 intrinsics (default: ON)"
+ echo " -DENABLE_COMPRESS=OFF Disable GPU code compression (default: ON)"
+ echo " -DRCCL_ROCPROFILER_REGISTER=OFF Disable rocprofiler-register support (default: ON)"
+ echo ""
+ echo " Environment variables:"
+ echo " ONLY_FUNCS Build only specified collective functions (debug builds only)."
+ echo " Restricts GPU kernel generation to the listed collectives, significantly"
+ echo " reducing build time during development. Use '|' to separate multiple functions."
+ echo " Example: ONLY_FUNCS=\"AllReduce|SendRecv\" ./install.sh --debug -t"
+ echo " Available: AllReduce, Broadcast, Reduce, AllGather, ReduceScatter,"
+ echo " AlltoAllPivot, SendRecv, AlltoAllGda, AlltoAllvGda"
+ echo " Advanced: Specify algo, protocol, redop, and type per collective."
+ echo " ONLY_FUNCS=\"AllReduce RING SIMPLE Sum f32|SendRecv\""
  }

  # #################################################
@@ -95,7 +122,7 @@ function display_help()
  # check if we have a modern version of getopt that can handle whitespace and long parameters
  getopt -T
  if [[ "$?" -eq 4 ]]; then
- GETOPT_PARSE=$(getopt --name "${0}" --options cdfhij:lprtq --longoptions address-sanitizer,dependencies,debug,debug-fast,dump-asm,enable-code-coverage,enable_backtrace,disable-colltrace,disable-msccl-kernel,enable-mscclpp,fast,help,install,jobs:,kernel-resource-use,local_gpu_only,amdgpu_targets:,no_clean,npkit-enable,log-trace,openmp-test-enable,roctx-enable,package_build,prefix:,rm-legacy-include-dir,run_tests_all,run_tests_quick,static,tests_build,time-trace,force-reduce-pipeline,generate-sym-kernels,quiet-warnings,disable-warp-speed,verbose,rocshmem -- "$@")
+ GETOPT_PARSE=$(getopt --name "${0}" --options cdfhij:lprtq --longoptions address-sanitizer,dependencies,debug,debug-fast,dump-asm,enable-code-coverage,enable_backtrace,disable-colltrace,disable-msccl-kernel,enable-mscclpp,enable-mpi-tests,fast,help,install,jobs:,kernel-resource-use,local_gpu_only,amdgpu_targets:,no_clean,npkit-enable,log-trace,openmp-test-enable,roctx-enable,package_build,prefix:,rm-legacy-include-dir,run_tests_all,run_tests_quick,static,tests_build,time-trace,force-reduce-pipeline,generate-sym-kernels,quiet-warnings,disable-warp-speed,verbose,rocshmem,cmake-options: -- "$@")
  else
  echo "Need a new version of getopt"
  exit 1
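The getopt change above registers `enable-mpi-tests` as a plain flag and `cmake-options:` (trailing colon) as an option that takes a value. A minimal, stand-alone sketch of that pattern follows. It assumes the enhanced (util-linux style) getopt that install.sh already requires, and the function name `parse_args` is hypothetical; the real script handles many more options.

```shell
# Hypothetical stand-alone sketch: parse one flag and one value-taking
# long option with enhanced getopt, mirroring the two new options.
parse_args() {
    enable_mpi_tests=false
    custom_cmake_options=""
    # The trailing ':' on cmake-options marks it as requiring an argument.
    parsed=$(getopt --name "parse_args" --options "" \
        --longoptions enable-mpi-tests,cmake-options: -- "$@") || return 1
    eval set -- "${parsed}"
    while true; do
        case "${1}" in
            --enable-mpi-tests) enable_mpi_tests=true; shift ;;
            --cmake-options)    custom_cmake_options=${2}; shift 2 ;;
            --)                 shift; break ;;
            *)                  echo "Unexpected command line parameter" >&2; return 1 ;;
        esac
    done
}

parse_args --enable-mpi-tests --cmake-options "-DFOO=BAR -DBAZ=ON"
echo "enable_mpi_tests=${enable_mpi_tests}"
echo "custom_cmake_options=${custom_cmake_options}"
```

Quoting the value on the command line keeps all the -D options together as the single argument of `--cmake-options`; getopt re-quotes it so that `eval set --` preserves the embedded space.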
@@ -121,6 +148,7 @@ while true; do
  --dump-asm) dump_asm=true; shift ;;
  --enable-mscclpp) mscclpp_enabled=true; shift ;;
  --enable-mscclpp-clip) enable_mscclpp_clip=true; shift ;;
+ --enable-mpi-tests) enable_mpi_tests=true; shift ;;
  --disable-roctx) roctx_enabled=false; shift ;;
  -f | --fast) build_local_gpu_only=true; collective_trace=false; msccl_kernel_enabled=false; shift ;;
  -h | --help) display_help; exit 0 ;;
@@ -146,6 +174,7 @@ while true; do
  --disable-warp-speed) warp_speed_enabled=false; shift ;;
  -q | --quiet-warnings) quiet_warnings=true; shift ;;
  --rocshmem) build_rocshmem_support=true; shift ;;
+ --cmake-options) custom_cmake_options=${2}; shift 2 ;;
  --) shift ; break ;;
  *) echo "Unexpected command line parameter received; aborting";
  exit 1
@@ -313,6 +342,15 @@ if [[ "${openmp_test_enabled}" == true ]]; then
  cmake_common_options="${cmake_common_options} -DOPENMP_TESTS_ENABLED=ON"
  fi

+ # Enable MPI tests (debug only)
+ if [[ "${enable_mpi_tests}" == true ]]; then
+ if [[ "${build_release}" == true ]]; then
+ echo "ERROR: --enable-mpi-tests requires --debug. Please re-run with --debug."
+ exit 1
+ fi
+ cmake_common_options="${cmake_common_options} -DENABLE_MPI_TESTS=ON"
+ fi
+
  # Force Reduce pipeline
  if [[ "${force_reduce_pipeline}" == true ]]; then
  cmake_common_options="${cmake_common_options} -DFORCE_REDUCE_PIPELINING=ON"
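The debug gate added in this hunk can be exercised in isolation with a small sketch. The function `mpi_test_cmake_flag` is hypothetical (install.sh performs the check inline and exits instead of returning); it takes the values of `build_release` and `enable_mpi_tests` as arguments so that each combination can be tried without running a build.

```shell
# Hypothetical helper mirroring the inline check added to install.sh.
#   $1: build_release     (true or false)
#   $2: enable_mpi_tests  (true or false)
# Prints the CMake flag to append, or fails if the combination is rejected.
mpi_test_cmake_flag() {
    if [ "$2" != true ]; then
        return 0    # MPI tests not requested: nothing to append
    fi
    if [ "$1" = true ]; then
        echo "ERROR: --enable-mpi-tests requires --debug. Please re-run with --debug." >&2
        return 1
    fi
    echo "-DENABLE_MPI_TESTS=ON"
}

mpi_test_cmake_flag false true    # prints -DENABLE_MPI_TESTS=ON
mpi_test_cmake_flag true true || echo "rejected: release build"
```

Failing early in the script, before CMake is invoked, matches the motivation stated in the commit message: the MPI tests reference internal RCCL symbols that are hidden in release builds.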
@@ -373,6 +411,11 @@ fi
  # Add build directory to RPATH for packaging dependency resolution
  cmake_common_options="${cmake_common_options} -DCMAKE_EXE_LINKER_FLAGS=\"-Wl,-rpath,${PWD}\""

+ # Append any custom CMake options passed via --cmake-options
+ if [[ ! -z "${custom_cmake_options}" ]]; then
+ cmake_common_options="${cmake_common_options} ${custom_cmake_options}"
+ fi
+
  # Initiate RCCL CMake
  # Passing ONLY_FUNCS separately (not as part of ${cmake_common_options}) as
  # ${ONLY_FUNCS} is a debug-only feature
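The append step above is plain string concatenation, which a short sketch makes concrete. The function `append_custom_options` and the placeholder base value are hypothetical; how install.sh later expands `cmake_common_options` into the CMake command line is outside this hunk, so the sketch stops at the combined string.

```shell
# Hypothetical helper: append user-supplied options to an existing
# option string only when something was actually supplied.
#   $1: existing cmake options
#   $2: value given to --cmake-options (may be empty)
append_custom_options() {
    if [ -n "$2" ]; then
        echo "$1 $2"
    else
        echo "$1"
    fi
}

# Placeholder base value, for illustration only.
append_custom_options "-DOPENMP_TESTS_ENABLED=ON" "-DBUILD_EXT_EXAMPLES=ON -DPROFILE=ON"
# prints: -DOPENMP_TESTS_ENABLED=ON -DBUILD_EXT_EXAMPLES=ON -DPROFILE=ON
```

Since the custom options are appended after the script's own options, a -D value given through `--cmake-options` appears later on the command line than any value the script set for the same variable.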

projects/rccl/test/CMakeLists.txt

Lines changed: 5 additions & 3 deletions
@@ -40,9 +40,11 @@ if(BUILD_TESTS)

  # MPI configuration
  if(ENABLE_MPI_TESTS)
- # Set default MPI path, allow user to override
- if(NOT DEFINED MPI_PATH)
- set(MPI_PATH "/opt/ompi" CACHE PATH "Path to MPI installation")
+ # Set MPI path: 1) environment variable (always wins), 2) CMake variable, 3) default
+ if(DEFINED ENV{MPI_PATH})
+ set(MPI_PATH "$ENV{MPI_PATH}" CACHE PATH "Path to MPI installation" FORCE)
+ elseif(NOT DEFINED MPI_PATH)
+ set(MPI_PATH "/opt/ompi" CACHE PATH "Path to MPI installation")
  endif()

  # Verify MPI path exists
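The lookup order introduced here (environment variable, then CMake variable, then the /opt/ompi default) can be rendered in shell for illustration. The function `resolve_mpi_path` is hypothetical, and the sketch simplifies one detail: it treats an empty value as unset, whereas CMake's `DEFINED ENV{MPI_PATH}` is also true for a variable that is set but empty.

```shell
# Hypothetical shell rendering of the precedence; the real logic is in CMake.
#   $1: value of the MPI_PATH environment variable (empty if unset)
#   $2: value of the MPI_PATH CMake variable, e.g. from -DMPI_PATH=... (empty if unset)
resolve_mpi_path() {
    if [ -n "$1" ]; then
        echo "$1"            # 1) environment variable always wins
    elif [ -n "$2" ]; then
        echo "$2"            # 2) CMake variable
    else
        echo "/opt/ompi"     # 3) default
    fi
}

resolve_mpi_path "/usr/lib/openmpi" "/opt/custom-mpi"   # prints /usr/lib/openmpi
resolve_mpi_path "" "/opt/custom-mpi"                   # prints /opt/custom-mpi
resolve_mpi_path "" ""                                  # prints /opt/ompi
```

The paths in the usage lines are placeholders. The `FORCE` keyword in the CMake version is what lets the environment variable overwrite a value already stored in the cache by an earlier configure run.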
