This repository was archived by the owner on Dec 1, 2024. It is now read-only.

Commit 4aa2661

Update Petals setup details
1 parent 15af128 commit 4aa2661

1 file changed, +3 -2 lines changed


benchmark/batch_size_table.md

Lines changed: 3 additions & 2 deletions
@@ -31,5 +31,6 @@ We attach the generation throughput here for reference.

 ### About Petals
 We also include [Petals](https://arxiv.org/abs/2209.01188) as an additional baseline.
-We normalize the throughput reported in the Petals paper (from the case of 14 real servers in Europe and North America) by the number of used GPUs and get an effective per-GPU throughput of 0.68/14 ≈ 0.05 token/s.
-For a more comprehensive comparison with Petals, see Section 6.4 in our paper.
+We measure the results of running OPT hosted on 1, 4, and 24 T4 GPUs (in case of 6.7B, 30B, and 175B respectively) on GCP.
+We perform 6 parallel requests to the system and divide the throughput by the number of used GPUs in each case.
+For a more comprehensive comparison with Petals, see Section 6.3 in our paper.
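
Both the removed and the added wording describe the same normalization: aggregate generation throughput divided by the number of GPUs used. The sketch below illustrates that arithmetic only. The function name `per_gpu_throughput` and the `GPUS_PER_MODEL` mapping are hypothetical names introduced here, not part of the repository. The GPU counts come from the added lines, and the 0.68 token/s over 14 servers figure comes from the removed line. No measured throughput for the new setup is assumed.

```python
# Hypothetical helper illustrating the per-GPU normalization described in the diff.
# GPU counts per OPT model size are taken from the added lines (T4 GPUs on GCP).
GPUS_PER_MODEL = {"OPT-6.7B": 1, "OPT-30B": 4, "OPT-175B": 24}


def per_gpu_throughput(total_tokens_per_s: float, num_gpus: int) -> float:
    """Divide aggregate generation throughput (token/s) by the number of GPUs used."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return total_tokens_per_s / num_gpus


# Figures from the removed line: 0.68 token/s reported across 14 servers.
paper_estimate = per_gpu_throughput(0.68, 14)
print(f"{paper_estimate:.2f} token/s per GPU")
```

With a measured aggregate throughput for a given model, the same call would be `per_gpu_throughput(measured, GPUS_PER_MODEL["OPT-175B"])`, where `measured` is whatever the 6 parallel requests produced in total.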
