
Commit 2bb9145

amitz-nv authored and dominicshanshan committed
[https://nvbugs/5510879][fix] Fix pytorch & TRT-python flows fused LoRA adapter modules weight split with TP>1 (NVIDIA#8063)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
1 parent a389863 commit 2bb9145

File tree

1 file changed: +9 −0 lines


tests/unittest/llmapi/test_llm_multi_gpu_pytorch.py

Lines changed: 9 additions & 0 deletions
@@ -62,6 +62,15 @@ def test_llama_7b_multi_lora_tp2():
         cuda_graph_config=None)


+@pytest.mark.gpu2
+def test_phi3_lora_fused_modules_output_on_tp2_identical_to_tp1() -> None:
+    check_phi3_lora_fused_modules_output_tp2_identical_to_tp1(
+        LLM,
+        # Disable CUDA graph
+        # TODO: remove this once we have a proper fix for CUDA graph in LoRA
+        cuda_graph_config=None)
+
+
 @pytest.mark.skip(reason="https://nvbugs/5560921")
 @skip_ray
 @pytest.mark.gpu2
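For context on what the fix targets: a "fused" module (e.g. gate/up or QKV projections fused into one weight) concatenates several sub-modules along the output dimension. When sharding such a weight for tensor parallelism, each sub-module must be split separately and the per-rank pieces re-concatenated; a naive contiguous split would hand one rank all of one sub-module. The sketch below is a hypothetical NumPy illustration of that splitting rule, not TensorRT-LLM's actual implementation (function and variable names are invented):

```python
import numpy as np

def split_fused_lora_b(fused_b: np.ndarray, sub_sizes: list[int], tp_size: int):
    """Return one shard per TP rank for a fused column-parallel LoRA 'B' weight.

    Hypothetical sketch: each sub-module's rows are split across ranks
    independently, then concatenated per rank, preserving the fused layout.
    """
    shards = [[] for _ in range(tp_size)]
    offset = 0
    for size in sub_sizes:
        sub = fused_b[offset:offset + size]        # rows of this sub-module
        per_rank = np.split(sub, tp_size, axis=0)  # split sub-module across ranks
        for rank, piece in enumerate(per_rank):
            shards[rank].append(piece)
        offset += size
    # Each rank's shard keeps the sub-module order of the fused weight
    return [np.concatenate(parts, axis=0) for parts in shards]

lora_rank = 8
gate = np.ones((16, lora_rank))        # stand-in for the "gate" sub-module
up = 2 * np.ones((16, lora_rank))      # stand-in for the "up" sub-module
fused = np.concatenate([gate, up], axis=0)
shards = split_fused_lora_b(fused, sub_sizes=[16, 16], tp_size=2)
# Each of the 2 shards has shape (16, 8): half of gate followed by half of up
```

With a correct split like this, a TP=2 run reproduces the TP=1 output, which is exactly what the new `test_phi3_lora_fused_modules_output_on_tp2_identical_to_tp1` test asserts end to end.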
