Open
Description
Hi, thanks for your great work.
I am following the instructions to install the project and run the test scripts.
I tried two systems: one with 4x A100 40GB and the other with 4x A100 80GB.
I run the benchmark with the following command:
ENABLE_INTRA_NODE_COMM=1 torchrun --standalone --nproc_per_node=4 tests/longspec_benchmark.py --target checkpoints/togethercomputer/LLaMA-2-7B-32K/model.pth --model checkpoints/TinyLlama/TinyLlama_v1.1/model.pth --model_name /home/users/jyao/ai_general_research/yueying/MagicDec/checkpoints/togethercomputer/LLaMA-2-7B-32K --rank_group 0 1 2 3 --draft_ranks 0 1 2 3 --gamma 3 --B 4 --prefix_len 16000 --gen_len 64 --streamingllm_budget 256 --benchmark
Both systems hang at some point during the run:

For example, it stops here. The nvidia-smi output shows:

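One way to narrow down where the run stalls is to rerun with extra distributed logging enabled. This is only a minimal sketch: it assumes the hang occurs inside a distributed collective call, which nothing in this report confirms. `NCCL_DEBUG` and `TORCH_DISTRIBUTED_DEBUG` are standard NCCL and PyTorch environment variables, not options of this repository.

```shell
# Assumption: the stall is inside a distributed collective (not confirmed by this report).
export NCCL_DEBUG=INFO                 # NCCL logs initialization and communicator setup details
export TORCH_DISTRIBUTED_DEBUG=DETAIL  # PyTorch adds extra logging and checks around collectives
echo "NCCL_DEBUG=$NCCL_DEBUG TORCH_DISTRIBUTED_DEBUG=$TORCH_DISTRIBUTED_DEBUG"
# Then rerun the same torchrun command shown above from this shell.
```

The last log lines printed by each rank before the stall can then be compared to see whether all four ranks reach the same point.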