1x/ 2x/ 4x RPI 5 8gb Llama 7b/13b #17
serralva-ruben started this conversation in Results
Nice! That's about 3 tokens/second for Llama 13B. Which switch/router did you use?
PS: I only ran each configuration once.
1x RPI 5 8gb Llama 7b
🔶 G 420 ms I 419 ms T 1 ms S 0 kB R 0 kB Hello
🔶 G 443 ms I 434 ms T 9 ms S 0 kB R 0 kB world
🔶 G 446 ms I 434 ms T 12 ms S 0 kB R 0 kB !
🔶 G 443 ms I 434 ms T 9 ms S 0 kB R 0 kB The
🔶 G 406 ms I 405 ms T 0 ms S 0 kB R 0 kB weather
🔶 G 407 ms I 407 ms T 0 ms S 0 kB R 0 kB was
🔶 G 443 ms I 435 ms T 8 ms S 0 kB R 0 kB pleasant
🔶 G 440 ms I 432 ms T 8 ms S 0 kB R 0 kB in
🔶 G 448 ms I 435 ms T 12 ms S 0 kB R 0 kB the
🔶 G 445 ms I 436 ms T 8 ms S 0 kB R 0 kB morning
🔶 G 445 ms I 437 ms T 8 ms S 0 kB R 0 kB ,
🔶 G 438 ms I 430 ms T 8 ms S 0 kB R 0 kB and
🔶 G 446 ms I 438 ms T 8 ms S 0 kB R 0 kB the
🔶 G 447 ms I 434 ms T 12 ms S 0 kB R 0 kB sun
🔶 G 449 ms I 441 ms T 8 ms S 0 kB R 0 kB was
🔶 G 448 ms I 436 ms T 12 ms S 0 kB R 0 kB sh
Generated tokens: 16
Avg generation time: 438.38 ms
Avg inference time: 430.44 ms
Avg transfer time: 7.69 ms
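For anyone who wants to recompute the averages from the raw lines, here is a minimal Python sketch. It assumes G, I and T are the per-token generation, inference and transfer times in milliseconds (which matches the "Avg ..." summary lines), and that S and R are kilobytes sent and received (my reading of the log, not something the output states). The names `parse_line` and `averages` are made up for this example.

```python
import re
from statistics import mean

# Matches e.g. "🔶 G 443 ms I 434 ms T 9 ms S 0 kB R 0 kB world".
# The token part may be empty, since some lines end right after "kB".
LINE_RE = re.compile(
    r"G\s+(?P<g>\d+) ms I\s+(?P<i>\d+) ms T\s+(?P<t>\d+) ms "
    r"S\s+(?P<s>\d+) kB R\s+(?P<r>\d+) kB ?(?P<token>.*)"
)

def parse_line(line):
    """Return a dict of timings for one log line, or None if it is not a token line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    row = {k: int(m.group(k)) for k in ("g", "i", "t", "s", "r")}
    row["token"] = m.group("token")
    return row

def averages(lines):
    """Average generation, inference and transfer time (ms) over all token lines."""
    rows = [r for r in map(parse_line, lines) if r is not None]
    return {k: mean(r[k] for r in rows) for k in ("g", "i", "t")}

if __name__ == "__main__":
    sample = [
        "🔶 G 420 ms I 419 ms T 1 ms S 0 kB R 0 kB Hello",
        "🔶 G 443 ms I 434 ms T 9 ms S 0 kB R 0 kB world",
        "🔶 G 446 ms I 434 ms T 12 ms S 0 kB R 0 kB !",
    ]
    print(averages(sample))
```

Feeding it all 16 lines of a run should reproduce the "Avg generation time" figure printed at the end of that run.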
2x RPI 5 8gb Llama 7b
🔶 G 291 ms I 257 ms T 34 ms S 1779278 kB R 522 kB Hello
🔶 G 263 ms I 228 ms T 35 ms S 590 kB R 522 kB world
🔶 G 301 ms I 257 ms T 44 ms S 590 kB R 522 kB ,
🔶 G 308 ms I 268 ms T 40 ms S 590 kB R 522 kB I
🔶 G 305 ms I 256 ms T 49 ms S 590 kB R 522 kB '
🔶 G 303 ms I 253 ms T 49 ms S 590 kB R 522 kB m
🔶 G 264 ms I 228 ms T 35 ms S 590 kB R 522 kB An
🔶 G 300 ms I 256 ms T 44 ms S 590 kB R 522 kB kit
🔶 G 305 ms I 257 ms T 48 ms S 590 kB R 522 kB D
🔶 G 304 ms I 257 ms T 47 ms S 590 kB R 522 kB w
🔶 G 343 ms I 301 ms T 41 ms S 590 kB R 522 kB ived
🔶 G 264 ms I 225 ms T 38 ms S 590 kB R 522 kB i
🔶 G 302 ms I 256 ms T 46 ms S 590 kB R 522 kB .
🔶 G 304 ms I 254 ms T 47 ms S 590 kB R 522 kB commits
🔶 G 266 ms I 234 ms T 32 ms S 590 kB R 522 kB
🔶 G 303 ms I 257 ms T 45 ms S 590 kB R 522 kB 4
Generated tokens: 16
Avg generation time: 295.38 ms
Avg inference time: 252.75 ms
Avg transfer time: 42.12 ms
4x RPI 5 8gb Llama 7b
🔶 G 182 ms I 143 ms T 39 ms S 2670078 kB R 784 kB Hello
🔶 G 180 ms I 135 ms T 45 ms S 2046 kB R 784 kB world
🔶 G 181 ms I 133 ms T 48 ms S 2046 kB R 784 kB !
🔶 G 181 ms I 132 ms T 49 ms S 2046 kB R 784 kB This
🔶 G 205 ms I 134 ms T 71 ms S 2046 kB R 784 kB is
🔶 G 223 ms I 168 ms T 54 ms S 2046 kB R 784 kB the
🔶 G 222 ms I 170 ms T 52 ms S 2046 kB R 784 kB brand
🔶 G 183 ms I 142 ms T 41 ms S 2046 kB R 784 kB new
🔶 G 223 ms I 169 ms T 53 ms S 2046 kB R 784 kB blog
🔶 G 223 ms I 169 ms T 54 ms S 2046 kB R 784 kB for
🔶 G 226 ms I 171 ms T 55 ms S 2046 kB R 784 kB the
🔶 G 222 ms I 173 ms T 49 ms S 2046 kB R 784 kB City
🔶 G 221 ms I 168 ms T 53 ms S 2046 kB R 784 kB of
🔶 G 220 ms I 166 ms T 53 ms S 2046 kB R 784 kB South
🔶 G 224 ms I 174 ms T 49 ms S 2046 kB R 784 kB F
🔶 G 216 ms I 171 ms T 45 ms S 2046 kB R 784 kB ult
Generated tokens: 16
Avg generation time: 208.25 ms
Avg inference time: 157.38 ms
Avg transfer time: 50.62 ms
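To put the three 7B runs side by side, the sketch below derives speedup over the single-device run, parallel efficiency (speedup divided by device count), and the share of each token's time spent on transfer. The input numbers are the averages reported above; the function name `scaling` is made up for this example, and since each configuration was only run once the results are rough.

```python
# Averages for Llama 7B copied from the runs above, keyed by device count (ms per token).
avg_generation_ms = {1: 438.38, 2: 295.38, 4: 208.25}
avg_transfer_ms = {1: 7.69, 2: 42.12, 4: 50.62}

def scaling(n):
    """Return (speedup vs. 1 device, parallel efficiency, transfer share) for n devices."""
    speedup = avg_generation_ms[1] / avg_generation_ms[n]
    efficiency = speedup / n
    transfer_share = avg_transfer_ms[n] / avg_generation_ms[n]
    return speedup, efficiency, transfer_share

if __name__ == "__main__":
    for n in (1, 2, 4):
        speedup, efficiency, transfer_share = scaling(n)
        print(f"{n}x: speedup {speedup:.2f}, "
              f"efficiency {efficiency:.0%}, transfer share {transfer_share:.0%}")
```

With these numbers, two devices come out roughly 1.5 times faster than one and four devices roughly 2.1 times faster, while the transfer share of each token grows from about 14% to about 24%.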
2x RPI 5 8gb Llama 13b
🔶 G 476 ms I 437 ms T 37 ms S 3485724 kB R 818 kB Hello
🔶 G 459 ms I 416 ms T 43 ms S 924 kB R 818 kB world
🔶 G 498 ms I 445 ms T 53 ms S 924 kB R 818 kB !
🔶 G 497 ms I 441 ms T 55 ms S 924 kB R 818 kB I
🔶 G 499 ms I 441 ms T 58 ms S 924 kB R 818 kB '
🔶 G 458 ms I 411 ms T 47 ms S 924 kB R 818 kB m
🔶 G 495 ms I 440 ms T 54 ms S 924 kB R 818 kB Ch
🔶 G 496 ms I 440 ms T 56 ms S 924 kB R 818 kB ase
🔶 G 496 ms I 445 ms T 51 ms S 924 kB R 818 kB ,
🔶 G 498 ms I 449 ms T 48 ms S 924 kB R 818 kB the
🔶 G 495 ms I 442 ms T 52 ms S 924 kB R 818 kB Le
🔶 G 496 ms I 444 ms T 51 ms S 924 kB R 818 kB ad
🔶 G 497 ms I 442 ms T 55 ms S 924 kB R 818 kB Mark
🔶 G 533 ms I 444 ms T 89 ms S 924 kB R 818 kB eting
🔶 G 462 ms I 414 ms T 47 ms S 924 kB R 818 kB Special
🔶 G 493 ms I 439 ms T 54 ms S 924 kB R 818 kB ist
Generated tokens: 16
Avg generation time: 490.50 ms
Avg inference time: 436.88 ms
Avg transfer time: 53.12 ms
4x RPI 5 8gb Llama 13b
🔶 G 312 ms I 267 ms T 42 ms S 1036099 kB R 1227 kB Hello
🔶 G 330 ms I 238 ms T 92 ms S 3203 kB R 1227 kB world
🔶 G 331 ms I 270 ms T 61 ms S 3203 kB R 1227 kB ,
🔶 G 349 ms I 287 ms T 61 ms S 3203 kB R 1227 kB I
🔶 G 294 ms I 239 ms T 55 ms S 3203 kB R 1227 kB '
🔶 G 334 ms I 271 ms T 62 ms S 3203 kB R 1227 kB m
🔶 G 328 ms I 268 ms T 60 ms S 3203 kB R 1227 kB the
🔶 G 332 ms I 273 ms T 58 ms S 3203 kB R 1227 kB one
🔶 G 295 ms I 243 ms T 52 ms S 3203 kB R 1227 kB who
🔶 G 370 ms I 308 ms T 61 ms S 3203 kB R 1227 kB is
🔶 G 336 ms I 271 ms T 65 ms S 3203 kB R 1227 kB trying
🔶 G 336 ms I 275 ms T 60 ms S 3203 kB R 1227 kB to
🔶 G 335 ms I 272 ms T 63 ms S 3203 kB R 1227 kB win
🔶 G 336 ms I 268 ms T 68 ms S 3203 kB R 1227 kB your
🔶 G 337 ms I 273 ms T 63 ms S 3203 kB R 1227 kB heart
🔶 G 335 ms I 269 ms T 66 ms S 3203 kB R 1227 kB .
Generated tokens: 16
Avg generation time: 330.62 ms
Avg inference time: 268.25 ms
Avg transfer time: 61.81 ms
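The average generation times convert directly to throughput: tokens per second is 1000 divided by the milliseconds per token. A small Python sketch using the five averages reported above (the dictionary layout and the name `tokens_per_second` are just for this example):

```python
# Average generation time per token (ms), copied from the five runs above.
avg_generation_ms = {
    ("7b", 1): 438.38,
    ("7b", 2): 295.38,
    ("7b", 4): 208.25,
    ("13b", 2): 490.50,
    ("13b", 4): 330.62,
}

def tokens_per_second(ms_per_token):
    """Convert milliseconds per token into tokens per second."""
    return 1000.0 / ms_per_token

if __name__ == "__main__":
    for (model, devices), ms in avg_generation_ms.items():
        print(f"{devices}x RPI 5, Llama {model}: {tokens_per_second(ms):.2f} tokens/s")
```

That gives about 2.3, 3.4 and 4.8 tokens/s for Llama 7B on 1, 2 and 4 devices, and about 2.0 and 3.0 tokens/s for Llama 13B on 2 and 4 devices.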