v0.2.4: Why is the total speed with multiple concurrent requests faster than with a single request in the showcased example? #1145
Unanswered
JennieGao-njust asked this question in Q&A
Replies: 0 comments
Request 3: Decode Speed = 2.80 tokens/s,First packet time = 44.05s
Request 1: Decode Speed = 3.00 tokens/s,First packet time = 40.20s
Request 0: Decode Speed = 4.07 tokens/s,First packet time = 79.58s
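Request 2: Decode Speed = 4.02 tokens/s,First packet time = 79.30s
In my single-concurrency test the speed is 11 tokens/s, so with concurrency that throughput is split almost evenly across the requests. Why, in the example, is the original single-concurrency speed 17 tokens/s while the speed with concurrency is 4*10 tokens/s?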
500 GB RAM, 500 GB disk. The model is UD-Q2_K_XL.
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 52 bits physical, 57 bits virtual
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 32
GPU: L20
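To make the comparison in the question concrete, here is a minimal Python sketch that sums the four per-request decode speeds from the log and compares the total with the single-concurrency figure. The variable and function names are illustrative only. Summing the speeds assumes all four requests were decoding over the same time window, which the differing first packet times (about 40 s versus about 80 s) may not fully support, so treat the result as a rough upper bound.

```python
# Per-request decode speeds (tokens/s), copied from the log in the question.
decode_speeds = {3: 2.80, 1: 3.00, 0: 4.07, 2: 4.02}

# Single-concurrency decode speed measured by the poster (tokens/s).
single_stream_speed = 11.0

# Figures quoted from the showcased example: 17 tokens/s for a single
# request, 4 requests at 10 tokens/s each with concurrency.
example_single = 17.0
example_concurrent = 4 * 10.0


def aggregate_speed(speeds):
    """Sum per-request speeds; assumes the requests decode simultaneously."""
    return sum(speeds.values())


measured_total = aggregate_speed(decode_speeds)
measured_ratio = measured_total / single_stream_speed
example_ratio = example_concurrent / example_single

print(f"measured aggregate: {measured_total:.2f} tokens/s "
      f"({measured_ratio:.2f}x single concurrency)")
print(f"example aggregate:  {example_concurrent:.2f} tokens/s "
      f"({example_ratio:.2f}x single concurrency)")
```

Under that assumption the measured aggregate is about 13.89 tokens/s (roughly 1.26x the single-request speed), while the example's figures correspond to roughly 2.35x, which is the gap the question is asking about.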