Open
Labels: good first issue
Description
Thanks for your excellent work! I ran into a problem while trying to use your framework.
I tried to run offloading.py and offloading_TP.py on a machine with 4 × RTX 4090 GPUs. As shown in the figure below, the progress bar has not updated for a long time, but GPU utilization is close to 100%.
The command I used:
CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=48 torchrun --nproc_per_node=2 test/offloading_TP.py --budget 12288 --prefill 130048 --dataset gs --target llama-7B-128K --on_chip 9 --gamma 16 --target /TriForce/models/Yarn-Llama-2-7b-128k
Is there something wrong?
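For scale, here is a rough estimate of the KV cache that this prefill length implies, which may be part of why the run is so slow. It is only a sketch: it assumes the standard Llama-2-7B shape (32 layers, hidden size 4096, full multi-head attention) and fp16 storage, and the helper names `kv_cache_bytes` and `to_gib` are hypothetical, not part of the framework.

```python
# Rough KV-cache size estimate for the --prefill value in the command above.
# Assumption: standard Llama-2-7B architecture values, fp16 cache entries.
NUM_LAYERS = 32        # assumed transformer layers
HIDDEN_SIZE = 4096     # assumed hidden size (32 heads x 128 dims)
BYTES_PER_VALUE = 2    # fp16


def kv_cache_bytes(num_tokens, num_layers=NUM_LAYERS,
                   hidden_size=HIDDEN_SIZE, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to store keys and values for num_tokens tokens."""
    # Factor 2 covers one key vector and one value vector per layer.
    return 2 * num_layers * hidden_size * bytes_per_value * num_tokens


def to_gib(num_bytes):
    """Convert bytes to GiB."""
    return num_bytes / 2**30


prefill_gib = to_gib(kv_cache_bytes(130048))
print(f"KV cache for 130048 tokens: {prefill_gib:.1f} GiB")
# prints "KV cache for 130048 tokens: 63.5 GiB"
```

Under these assumptions the full cache is well beyond the 24 GB of a single RTX 4090, so most of it would have to sit in CPU memory, and each decoding step may be waiting on host-to-device transfers while the GPUs still report high utilization.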

