131 commits
9fa4846
doc
Binyang2014 Oct 29, 2025
61ab551
doc
Binyang2014 Oct 29, 2025
e5f7a2b
revise
Binyang2014 Oct 29, 2025
67e6bcf
WIP
Binyang2014 Oct 29, 2025
7dca157
WIP
Binyang2014 Oct 29, 2025
62db986
WIP
Binyang2014 Oct 29, 2025
f24b8a6
Merge branch 'main' into binyli/torch-integration
Binyang2014 Oct 29, 2025
1ba1172
WIP
Binyang2014 Oct 29, 2025
0a5653b
WIP
Binyang2014 Oct 30, 2025
e77635f
WIP
Binyang2014 Oct 31, 2025
262485f
WIP
Binyang2014 Nov 4, 2025
27bfd7a
update
Binyang2014 Nov 5, 2025
15d2a14
WIP
Binyang2014 Nov 6, 2025
f254834
Merge branch 'main' into binyli/torch-integration
Binyang2014 Nov 7, 2025
21903e4
refactor
Binyang2014 Nov 7, 2025
1fbec20
compile pass
Binyang2014 Nov 7, 2025
0ebf12f
update
Binyang2014 Nov 8, 2025
8ba0730
WIP
Binyang2014 Nov 8, 2025
346cdbe
WIP
Binyang2014 Nov 11, 2025
2494ce6
Refactor
Binyang2014 Nov 11, 2025
fd6b5e9
fix
Binyang2014 Nov 11, 2025
883f9ef
WIP
Binyang2014 Nov 11, 2025
968f0a9
WIP
Binyang2014 Nov 11, 2025
b0dcfeb
WIP
Binyang2014 Nov 11, 2025
270889d
WIP
Binyang2014 Nov 11, 2025
8d2eaeb
WIP
Binyang2014 Nov 11, 2025
54ac481
WIP
Binyang2014 Nov 12, 2025
c9e8d17
fix
Binyang2014 Nov 12, 2025
033d862
fix perf
Binyang2014 Nov 12, 2025
b3c2935
update python binding
Binyang2014 Nov 12, 2025
01c6c90
temp
Binyang2014 Nov 12, 2025
6097bb6
WIP
Binyang2014 Nov 13, 2025
366fef7
WIP
Binyang2014 Nov 13, 2025
639cd43
WIP
Binyang2014 Nov 13, 2025
147b1f3
WIP
Binyang2014 Nov 14, 2025
2a568d7
WIP
Binyang2014 Nov 14, 2025
a0196c9
WIP
Binyang2014 Nov 14, 2025
a892c45
WIP
Binyang2014 Nov 15, 2025
e3299d5
merge main
Binyang2014 Nov 17, 2025
9e97afb
WIP
Binyang2014 Nov 18, 2025
8ccaf09
WIP
Binyang2014 Nov 18, 2025
3ff4441
minor fix
Binyang2014 Nov 18, 2025
975e86b
lint
Binyang2014 Nov 19, 2025
d495a00
WIP
Binyang2014 Nov 19, 2025
2c7004a
WIP
Binyang2014 Nov 19, 2025
10fe852
WIP
Binyang2014 Nov 24, 2025
b0709bd
demo
Binyang2014 Nov 24, 2025
3761a10
update
Binyang2014 Nov 25, 2025
cb5184f
some fix
Binyang2014 Nov 25, 2025
98dec89
WIP
Binyang2014 Nov 25, 2025
13be162
WIP
Binyang2014 Nov 25, 2025
9a7d518
WIP
Binyang2014 Nov 25, 2025
4bd1633
WIP
Binyang2014 Nov 25, 2025
aa0cee3
WIP
Binyang2014 Nov 25, 2025
b683c91
fix build issue
Binyang2014 Nov 25, 2025
7a1ecaa
make nvls work
Binyang2014 Nov 25, 2025
42f0eae
for nvls non-zero copy
Binyang2014 Nov 26, 2025
6606262
all nvls algo work
Binyang2014 Nov 26, 2025
a6a0c35
WIP
Binyang2014 Nov 26, 2025
75a7eb1
WIP
Binyang2014 Nov 26, 2025
c9eed8d
WIP
Binyang2014 Nov 26, 2025
8d3c5d3
WIP
Binyang2014 Nov 27, 2025
e2a5d69
WIP
Binyang2014 Nov 27, 2025
cef1f89
update
Binyang2014 Nov 30, 2025
d103a5e
WIP
Binyang2014 Dec 1, 2025
808e6af
WIP
Binyang2014 Dec 1, 2025
9d99b14
WIP
Binyang2014 Dec 1, 2025
87aeda7
WIP
Binyang2014 Dec 1, 2025
e2b5eda
lint
Binyang2014 Dec 1, 2025
c44abb8
fix for rocm
Binyang2014 Dec 1, 2025
1a616a2
fix for cuda11
Binyang2014 Dec 1, 2025
3fdf4cc
bug fix
Binyang2014 Dec 2, 2025
53bb832
WIP
Binyang2014 Dec 2, 2025
20d6514
crash fix
Binyang2014 Dec 2, 2025
75f6ec0
WIP
Binyang2014 Dec 2, 2025
8ace561
make it work
Binyang2014 Dec 2, 2025
cc18f58
WIP
Binyang2014 Dec 2, 2025
919ba6a
Fix
Binyang2014 Dec 3, 2025
da07df6
Merge branch 'main' into binyli/torch-integration
Binyang2014 Dec 3, 2025
d1a74ce
fix doc build
Binyang2014 Dec 3, 2025
115fded
update doc
Binyang2014 Dec 3, 2025
d2de261
Update examples/torch-integration/customized_comm_with_default_algo.py
Binyang2014 Dec 3, 2025
d1f94c1
WIP
Binyang2014 Dec 4, 2025
f28c085
update
Binyang2014 Dec 5, 2025
819c6b8
fix
Binyang2014 Dec 5, 2025
7b1a500
fix for mi300x
Binyang2014 Dec 5, 2025
e818db5
update doc string
Binyang2014 Dec 6, 2025
38956fb
add pybind11
Binyang2014 Dec 6, 2025
a6ac934
lint
Binyang2014 Dec 6, 2025
682ad12
fix ut
Binyang2014 Dec 7, 2025
5266962
merge main
Binyang2014 Dec 18, 2025
1b6010e
fix for ut
Binyang2014 Dec 18, 2025
e13b9c2
fix bug
Binyang2014 Dec 19, 2025
52d52b1
merge main
Binyang2014 Dec 19, 2025
b4dbc0b
fix for rocm
Binyang2014 Dec 19, 2025
07fa613
Merge branch 'main' into binyli/torch-integration
Binyang2014 Dec 22, 2025
9f5168d
update for rocm
Binyang2014 Dec 22, 2025
8b83ff3
fix for npkit
Binyang2014 Dec 22, 2025
f4a96da
WIP
Binyang2014 Jan 11, 2026
75e13d8
fix
Binyang2014 Jan 12, 2026
6b38f0b
merge main
Binyang2014 Jan 12, 2026
086fd26
reorgnize the code
Binyang2014 Jan 12, 2026
c1eb114
update
Binyang2014 Jan 12, 2026
ab3965d
fix
Binyang2014 Jan 12, 2026
a767d41
WIP
Binyang2014 Jan 12, 2026
4b4ef43
make nccl-test run
Binyang2014 Jan 12, 2026
347d37e
update
Binyang2014 Jan 12, 2026
d2e3ab1
WIP
Binyang2014 Jan 14, 2026
09b4285
WIP
Binyang2014 Jan 14, 2026
3413209
WIP
Binyang2014 Jan 14, 2026
ff751b3
update
Binyang2014 Jan 14, 2026
1e73a5a
Merge branch 'main' into binyli/refactor
Binyang2014 Jan 14, 2026
5b54743
WIP
Binyang2014 Jan 14, 2026
5a1c058
make example work
Binyang2014 Jan 14, 2026
f91103b
fix npkit build
Binyang2014 Jan 15, 2026
361ae4e
link
Binyang2014 Jan 15, 2026
296d85e
merge main
Binyang2014 Jan 15, 2026
b386a61
fix
Binyang2014 Jan 15, 2026
25b4e66
Address code review feedback: license headers, naming conventions, an…
Copilot Jan 15, 2026
c1db742
fix ci issue
Binyang2014 Jan 15, 2026
298e3b0
lint fix
Binyang2014 Jan 15, 2026
a0fe68e
fix build for fp8
Binyang2014 Jan 15, 2026
d21decd
fix
Binyang2014 Jan 15, 2026
5d11bc8
fix for tests
Binyang2014 Jan 15, 2026
0dcdc04
fix
seagater Jan 15, 2026
85299c7
fix ut
Binyang2014 Jan 16, 2026
bdabb12
fix for nccl-test
Binyang2014 Jan 16, 2026
0e03bcc
Merge branch 'main' into binyli/torch-integration
Binyang2014 Jan 16, 2026
5d66c38
move context structure to internal header
Binyang2014 Jan 17, 2026
431379a
lint
Binyang2014 Jan 18, 2026
d9f3918
lint
Binyang2014 Jan 18, 2026
4 changes: 2 additions & 2 deletions .azure-pipelines/integration-test-rocm.yml
@@ -94,7 +94,7 @@ jobs:
script: |
set -e
export PATH=/usr/local/mpi/bin:$PATH
- sudo mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN -x LD_PRELOAD="$(pwd)/build/apps/nccl/libmscclpp_nccl.so" \
+ sudo mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN -x LD_PRELOAD="$(pwd)/build/lib/libmscclpp_nccl.so" \
-x ALLREDUCE_SMALL_MSG_BOUNDARY=32K -x ALLREDUCE_LARGE_MSG_BOUNDARY=1M ./rccl-tests/build/all_reduce_perf -b 1K -e 1G -f 2 -d half -G 20 -w 10 -n 100
workingDirectory: '$(System.DefaultWorkingDirectory)'

@@ -106,7 +106,7 @@ jobs:
script: |
set -e
export PATH=/usr/local/mpi/bin:$PATH
- sudo mpirun -np 8 --bind-to numa --allow-run-as-root -x LD_PRELOAD=$(pwd)/build/apps/nccl/libmscclpp_nccl.so -x NCCL_DEBUG=WARN \
+ sudo mpirun -np 8 --bind-to numa --allow-run-as-root -x LD_PRELOAD=$(pwd)/build/lib/libmscclpp_nccl.so -x NCCL_DEBUG=WARN \
-x ALLREDUCEPKT_IP_JSON_FILE=./msccl-users/execution-files/allreduce_mi300_packet.json \
-x ALLREDUCE_IP_JSON_FILE=./msccl-users/execution-files/allreduce_mi300_sm_mscclpp.json \
-x ALLREDUCE_SMALL_MSG_BOUNDARY=32K -x ALLREDUCE_LARGE_MSG_BOUNDARY=1M ./rccl-tests/build/all_reduce_perf \
44 changes: 22 additions & 22 deletions .azure-pipelines/templates/integration-test.yaml
@@ -74,13 +74,13 @@ steps:
parallel-ssh -o . -t 0 -h ${HOSTFILE} -x "-i ${KeyFilePath}" \
-O $SSH_OPTION 'sudo docker exec -t mscclpp-test bash -c " \
export PATH=/usr/local/mpi/bin:\$PATH; \
- export LD_LIBRARY_PATH=/root/mscclpp/build:\$LD_LIBRARY_PATH; \
+ export LD_LIBRARY_PATH=/root/mscclpp/build/lib:\$LD_LIBRARY_PATH; \
cd /root/mscclpp; \
set -e; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allgather_test_perf -b 1K -e 1G -f 2 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allgather_test_perf -b 1K -e 1G -f 2 -k 1 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allgather_test_perf -b 1K -e 1G -f 2 -k 2 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allgather_test_perf -b 1K -e 1G -f 2 -k 3 -o output.jsonl"'
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allgather_test_perf -b 1K -e 1G -f 2 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allgather_test_perf -b 1K -e 1G -f 2 -k 1 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allgather_test_perf -b 1K -e 1G -f 2 -k 2 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allgather_test_perf -b 1K -e 1G -f 2 -k 3 -o output.jsonl"'
kill $CHILD_PID
workingDirectory: '$(System.DefaultWorkingDirectory)'

@@ -101,9 +101,9 @@ steps:
-O $SSH_OPTION 'sudo docker exec -t mscclpp-test bash -c "\
set -e; \
export PATH=/usr/local/mpi/bin:\$PATH; \
- export LD_LIBRARY_PATH=/root/mscclpp/build:\$LD_LIBRARY_PATH; \
+ export LD_LIBRARY_PATH=/root/mscclpp/build/lib:\$LD_LIBRARY_PATH; \
cd /root/mscclpp; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/sendrecv_test_perf -b 1K -e 1G -f 2 -o output.jsonl"'
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/sendrecv_test_perf -b 1K -e 1G -f 2 -o output.jsonl"'
kill $CHILD_PID
workingDirectory: '$(System.DefaultWorkingDirectory)'

@@ -124,15 +124,15 @@ steps:
-O $SSH_OPTION 'sudo docker exec -t mscclpp-test bash -c "\
set -e; \
export PATH=/usr/local/mpi/bin:\$PATH; \
- export LD_LIBRARY_PATH=/root/mscclpp/build:\$LD_LIBRARY_PATH; \
+ export LD_LIBRARY_PATH=/root/mscclpp/build/lib:\$LD_LIBRARY_PATH; \
cd /root/mscclpp; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -k 1 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -k 2 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -k 3 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -k 4 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allreduce_test_perf -b 12M -e 48M -i 3145728 2 -k 5 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/allreduce_test_perf -b 24K -e 768K -i 24576 -k 6 -w 100 -n 100 -o output.jsonl"'
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -k 1 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -k 2 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -k 3 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allreduce_test_perf -b 1K -e 1G -f 2 -k 4 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allreduce_test_perf -b 12M -e 48M -i 3145728 2 -k 5 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/allreduce_test_perf -b 24K -e 768K -i 24576 -k 6 -w 100 -n 100 -o output.jsonl"'
kill $CHILD_PID
workingDirectory: '$(System.DefaultWorkingDirectory)'

@@ -152,10 +152,10 @@ steps:
-O $SSH_OPTION 'sudo docker exec -t mscclpp-test bash -c "\
set -e; \
export PATH=/usr/local/mpi/bin:\$PATH; \
- export LD_LIBRARY_PATH=/root/mscclpp/build:\$LD_LIBRARY_PATH; \
+ export LD_LIBRARY_PATH=/root/mscclpp/build/lib:\$LD_LIBRARY_PATH; \
cd /root/mscclpp; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/alltoall_test_perf -b 1K -e 1G -f 2 -o output.jsonl; \
- mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/test/mscclpp-test/alltoall_test_perf -b 1K -e 1G -f 2 -k 1 -o output.jsonl"'
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/alltoall_test_perf -b 1K -e 1G -f 2 -o output.jsonl; \
+ mpirun --allow-run-as-root -np 8 --bind-to numa -x MSCCLPP_DEBUG=WARN ./build/bin/mscclpp-test/alltoall_test_perf -b 1K -e 1G -f 2 -k 1 -o output.jsonl"'
kill $CHILD_PID
workingDirectory: '$(System.DefaultWorkingDirectory)'

@@ -177,7 +177,7 @@ steps:
set -e; \
cd /root/mscclpp; \
export PATH=/usr/local/mpi/bin:\$PATH; \
- export LD_LIBRARY_PATH=/root/mscclpp/build:\$LD_LIBRARY_PATH; \
+ export LD_LIBRARY_PATH=/root/mscclpp/build/lib:\$LD_LIBRARY_PATH; \
python3 test/mscclpp-test/check_perf_result.py --perf-file output.jsonl --baseline-file ${{ parameters.perfBaselineFile }}"'
kill $CHILD_PID
workingDirectory: '$(System.DefaultWorkingDirectory)'
@@ -200,7 +200,7 @@ steps:
set -e; \
cd /root/mscclpp; \
export PATH=/usr/local/mpi/bin:\$PATH; \
- export LD_LIBRARY_PATH=/root/mscclpp/build:\$LD_LIBRARY_PATH; \
+ export LD_LIBRARY_PATH=/root/mscclpp/build/lib:\$LD_LIBRARY_PATH; \
python3 -m pip install .; \
mpirun --allow-run-as-root -tag-output -x LD_LIBRARY_PATH=/usr/local/cuda/compat:$LD_LIBRARY_PATH -x MSCCLPP_HOME=/root/mscclpp -np 8 python3 ./python/mscclpp_benchmark/allreduce_bench.py"'
kill $CHILD_PID
@@ -223,9 +223,9 @@ steps:
-O $SSH_OPTION 'sudo docker exec -t mscclpp-test bash -c "\
set -e; \
export PATH=/usr/local/mpi/bin:\$PATH; \
- export LD_LIBRARY_PATH=/root/mscclpp/build:\$LD_LIBRARY_PATH; \
+ export LD_LIBRARY_PATH=/root/mscclpp/build/lib:\$LD_LIBRARY_PATH; \
cd /root/mscclpp; \
- ./build/test/perf/fifo_test"'
+ ./build/bin/perf/fifo_test"'
kill $CHILD_PID
workingDirectory: '$(System.DefaultWorkingDirectory)'
